Subject: Re: Something locks up in file system on RAID5
To: None <current-users@netbsd.org>
From: Kazushi (Jam) Marukawa <jam@pobox.com>
List: current-users
Date: 10/28/2001 16:41:28
Hi,

I tested older kernels.  I get the same lock-up problem with
the Oct 6 kernel as well.  The kernel that worked well here
was the Sep 21 kernel.  Both are 1.5Y, but some change made
to the kernel between about Sep 21 and Oct 6 is causing this
problem.

If anybody remembers a relevant change, please take a look.

The problem is that every process that tries to write past
the file system limit locks up.  This limit is the software
(user-level) limit.  After that, any process accessing the
same area (directory) of the file system locks up too.  After
8 hours, the entire file system became untouchable.  I'm using
a regular file system, without the softdep option, on RAID5.
More details are below.
Thanks in advance.
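
For anyone who wants to try this, the following is a minimal
sketch of the kind of writer that runs into the limit.  It is
not the perl or wget job from the logs.  The function name, the
chunk size, and the max_bytes safety cap are all hypothetical,
and it assumes the limit shows up to the process as ENOSPC.

```python
import errno
import os


def fill_until_full(path, chunk_size=1 << 20, max_bytes=None):
    """Append zero-filled chunks to `path` until write() fails with
    ENOSPC, or until the optional `max_bytes` cap is reached.

    Returns (bytes_written, hit_limit).
    """
    chunk = b"\0" * chunk_size
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        while max_bytes is None or written < max_bytes:
            # Never write past the cap when one is given.
            want = chunk_size
            if max_bytes is not None:
                want = min(chunk_size, max_bytes - written)
            try:
                written += os.write(fd, chunk[:want])
            except OSError as e:
                if e.errno == errno.ENOSPC:
                    # The file system refused the write: limit reached.
                    return written, True
                raise
        return written, False
    finally:
        os.close(fd)
```

Run as an ordinary user with max_bytes left unset and a path on
the nearly full file system, this keeps writing until the limit
is hit.  On an affected kernel the process may then hang instead
of returning, so try it only on a machine you can reboot.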

Regards,
-- Kazushi

   On Oct 28, 10:14, Kazushi (Jam) Marukawa wrote:
   > Subject: Re: Something locks up in file system on RAID5
   > The trouble is in the latest current also.  I built a new
   > kernel on Oct 27 and got the same lock-up problem when I
   > wrote data to the RAID5 disk even though it was full.  It
   > went something like this.  I received the "write failed,
   > file system is full" message 5 times, and then received
   > the same message once more.  Then the kernel was gone.
   > 
   >   Oct 28 09:25:19 sou /netbsd: uid 1000 comm perl on /mnt2: file system full
   >   Oct 28 09:25:20 sou last message repeated 5 times
   >   Oct 28 09:25:24 sou /netbsd: uid 1000 comm wget on /mnt2: file system full
   > 
   > As I remember, the Oct 6 kernel, or at least the Sep 22
   > kernel, did not lock up in the same situation.  I'm now
   > running fsck on the file system.  I'll try older kernels
   > after the fsck.
   > 
   > Ps shows a different status this time.  Last time, it was
   > vnlock.
   > 
   >   1000  7783 20305   1  -5   0  2860  3188 biowait  DL+  p5 1:24.86 /usr/bin/perl ...
   > 
   > FYI, my /mnt2 is a regular ufs on RAID5, without any
   > options such as softdep.
   > 
   >    On Oct 26, 23:39, Kazushi (Jam) Marukawa wrote:
   >    > Subject: Something locks up in file system on RAID5
   >    > I'm using the Oct 24 current version of NetBSD.  The file
   >    > system on RAID5 just locks up somehow.  Other processes
   >    > that access that area lock up also.  I don't know whether
   >    > this is a RAID5 problem or a general problem.  I have only
   >    > experienced it on a RAID5 file system.
   >    > 
   >    > Oh yes, I forgot to mention this.  I experienced this
   >    > lock-up after hitting a file system full condition.  All
   >    > the locked-up processes were writing large amounts of data
   >    > and stopped while the file system full warning was showing.
   >    > 
   >    > 
   >    > BTW, some of the locked-up processes show the following
   >    > different information.
   >    > 
   >    > 1000 24172 24122   0  -2   4   324     4 vnlock   DWN+  p7 0:00.00 ls -F
   >    > 1000 24169   290   7 -14   0   476     4 vgone    DW+   p2 0:00.00 du -sk ...
   >    > 1000 23129     1   0 -14   0  2376     4 vget     DW    p7- 0:00.00 /usr/bin/perl ...
   > 
   > Regards,
   > -- Kazushi
   > 
   >-- End of excerpt from Kazushi (Jam) Marukawa
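
The ps lines quoted above can be summarised by wait channel
with a small script.  This is only a sketch: it assumes the
lines come from a long-format ps listing whose columns are
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND.
That layout is inferred from the quoted lines rather than
stated anywhere in this thread, and the function names are
hypothetical.

```python
FIELDS = ("uid", "pid", "ppid", "cpu", "pri", "ni",
          "vsz", "rss", "wchan", "stat", "tt", "time")


def parse_ps_line(line):
    """Split one long-format ps line into a dict, or return None if
    the line does not look like a process entry (e.g. the header)."""
    parts = line.strip().split(None, len(FIELDS))
    if len(parts) <= len(FIELDS) or not parts[1].isdigit():
        return None
    rec = dict(zip(FIELDS, parts))
    rec["command"] = parts[len(FIELDS)]  # rest of the line, spaces kept
    return rec


def blocked_by_wchan(lines):
    """Map each wait channel to the PIDs in uninterruptible wait
    (STAT starting with 'D') that are sleeping on it."""
    result = {}
    for line in lines:
        rec = parse_ps_line(line)
        if rec is None or not rec["stat"].startswith("D"):
            continue
        result.setdefault(rec["wchan"], []).append(int(rec["pid"]))
    return result


if __name__ == "__main__":
    sample = [
        "1000 24172 24122   0  -2   4   324     4 vnlock   DWN+  p7 0:00.00 ls -F",
        "1000 24169   290   7 -14   0   476     4 vgone    DW+   p2 0:00.00 du -sk ...",
    ]
    for wchan, pids in sorted(blocked_by_wchan(sample).items()):
        print(wchan, pids)
```

Grouping by wait channel makes it easier to see at a glance
whether the stuck processes are all waiting on vnode locks
(vnlock, vget, vgone) or on disk I/O (biowait).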