Subject: Something locks up in file system on RAID5
To: None <current-users@netbsd.org>
From: Kazushi (Jam) Marukawa <jam@pobox.com>
List: current-users
Date: 10/26/2001 23:39:32
I'm using the Oct 24 version of NetBSD-current.  A file system
on RAID5 just locks up somehow.  Other processes that access
that area lock up as well.  I don't know whether this is a
RAID5 problem or a general problem.  I have only experienced
it on a RAID5 file system.

Oh yes, I forgot to mention this.  I experienced
this lock-up after running into a file system full problem.
All of the locked-up processes were writing large amounts of
data and stopped while the file system full warning was
being shown.


BTW, some of the locked-up processes show different
information, as follows:

1000 24172 24122   0  -2   4   324     4 vnlock   DWN+  p7 0:00.00 ls -F
1000 24169   290   7 -14   0   476     4 vgone    DW+   p2 0:00.00 du -sk ...
1000 23129     1   0 -14   0  2376     4 vget     DW    p7- 0:00.00 /usr/bin/perl ...
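
For anyone comparing similar hangs, here is a minimal sketch that
pulls the PID, wait channel, and command out of lines like the ones
above.  It assumes the column layout is UID PID PPID CPU PRI NI VSZ
RSS WCHAN STAT TT TIME COMMAND, which is a guess based on the three
lines shown; the helper name blocked_processes is hypothetical.

```python
# Sample lines copied from the listing above. The column layout
# (UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND)
# is an assumption, not something stated in the original post.
PS_LINES = """\
1000 24172 24122   0  -2   4   324     4 vnlock   DWN+  p7 0:00.00 ls -F
1000 24169   290   7 -14   0   476     4 vgone    DW+   p2 0:00.00 du -sk ...
1000 23129     1   0 -14   0  2376     4 vget     DW    p7- 0:00.00 /usr/bin/perl ...
"""


def blocked_processes(text):
    """Return (pid, wchan, command) for each process whose STAT starts with 'D'."""
    result = []
    for line in text.splitlines():
        # Split into 12 leading columns plus the command, which may contain spaces.
        fields = line.split(None, 12)
        if len(fields) < 13:
            continue  # skip short or malformed lines
        pid = int(fields[1])
        wchan = fields[8]
        stat = fields[9]
        command = fields[12]
        if stat.startswith("D"):  # 'D' marks uninterruptible (disk) wait
            result.append((pid, wchan, command))
    return result


if __name__ == "__main__":
    for pid, wchan, command in blocked_processes(PS_LINES):
        print(pid, wchan, command)
```

Grouping the results by wait channel (vnlock, vgone, vget here) makes
it easier to see which vnode operations the stuck processes are
sleeping in.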

ktrace of ls -F shows:

 24240 ktrace   NAMI  "/bin/ls"
 24240 ls       EMUL  "netbsd"
 24240 ls       RET   execve JUSTRETURN
 24240 ls       CALL  issetugid
 24240 ls       RET   issetugid 0
 24240 ls       CALL  ioctl(0x1,TIOCGETA,0xbfbfcdbc)
 24240 ls       RET   ioctl 0
 24240 ls       CALL  ioctl(0x1,TIOCGWINSZ,0xbfbfce30)
 24240 ls       RET   ioctl 0
 24240 ls       CALL  getuid
 24240 ls       RET   getuid 1000/0x3e8
 24240 ls       CALL  __sysctl(0xbfbfccc0,0x2,0xbfbfccb8,0xbfbfccbc,0,0)
 24240 ls       RET   __sysctl 0
 24240 ls       CALL  readlink(0x8087f60,0xbfbfcd18,0x3f)
 24240 ls       NAMI  "/etc/malloc.conf"
 24240 ls       RET   readlink -1 errno 2 No such file or directory
 24240 ls       CALL  mmap(0,0x1000,0x3,0x1002,0xffffffff,0,0,0)
 24240 ls       RET   mmap 1208520704/0x48089000
 24240 ls       CALL  break(0x8090adc)
 24240 ls       RET   break 0
 24240 ls       CALL  break(0x8091adc)
 24240 ls       RET   break 0
 24240 ls       CALL  break(0x8092000)
 24240 ls       RET   break 0
 24240 ls       CALL  break(0x8093000)
 24240 ls       RET   break 0
 24240 ls       CALL  break(0x8094000)
 24240 ls       RET   break 0
 24240 ls       CALL  break(0x8095000)
 24240 ls       RET   break 0
 24240 ls       CALL  __lstat13(0x8094140,0x8094148)
 24240 ls       NAMI  "work"

Then it just hangs.  I'm now fetching the current sources to
compile the kernel again and restart the system.  I'll let you
know if I notice any differences.  I would appreciate hearing
about any related issues.  Thanks.

-- Kazushi