Subject: b_resid > b_bcount?
To: None <tech-kern@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 03/29/2005 16:04:46

A NetBSD 2.0 machine at work has been wedging recently.  This time I spent
some time poking at it with ddb, and after chasing through "proc P1 is
blocking on lock L1, which is held by P2, who is blocking on L2, which
is held by...", I wound up at a process that was blocked in a call
chain that went sys_open, vn_open, ufs_setattr, ffs_truncate,
ffs_update, bwrite, biowait, ltsleep.  I looked at the buffer passed to
bwrite/biowait, and it showed flags=CACHE|SCANNED|BUSY, dev=0x80001300.
The dev is slightly odd, because the high bit of the minor number is
set for no visible reason.  (The machine does in fact have root on
ld0a, but /dev/ld0a has minor 0, not 0x80000.)
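
For reference, the arithmetic behind that reading of the dev can be
sketched as below.  The masks are my understanding of how the major()
and minor() macros in <sys/types.h> split a 32-bit dev_t (major in bits
8-19, minor in bits 0-7 plus bits 20-31); treat that layout as an
assumption and check it against the tree in question.  The names
dev_major, dev_minor, dev_print and dev_selfcheck are made up for this
sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Assumed 32-bit dev_t layout (not verified against the 2.0 sources):
 *   bits  8-19  major number
 *   bits  0-7   low 8 bits of the minor number
 *   bits 20-31  high 12 bits of the minor number
 */
static uint32_t
dev_major(uint32_t dev)
{
	return (dev & 0x000fff00u) >> 8;
}

static uint32_t
dev_minor(uint32_t dev)
{
	return ((dev & 0xfff00000u) >> 12) | (dev & 0x000000ffu);
}

/* Print a dev value split into its major and minor parts. */
static void
dev_print(uint32_t dev)
{
	printf("dev=0x%08x major=%u minor=0x%x\n",
	    (unsigned)dev, (unsigned)dev_major(dev), (unsigned)dev_minor(dev));
}

/* Sanity check: the same dev without the top bit decodes to minor 0. */
static void
dev_selfcheck(void)
{
	assert(dev_major(0x00001300u) == 19);
	assert(dev_minor(0x00001300u) == 0);
}
```

Under that assumed layout, dev_print(0x80001300) reports major 19 and
minor 0x80000, which is where the "high bit of the minor number" comes
from: bit 31 of the dev lands in bit 19 of the minor.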

But the real oddity is that the buffer has b_bufsize=b_bcount=0x2000,
but b_resid=0x258f.  Is it as wrong as it looks for b_resid to be
higher than b_bcount like this, or is there just something I'm not
understanding here?
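
To put numbers on it: as I understand the convention, b_resid is the
count of bytes not yet transferred, so I would expect it to stay within
0 <= b_resid <= b_bcount.  A sketch of that check follows; struct
fake_buf, resid_sane and resid_excess are hypothetical stand-ins for
illustration, not the kernel's struct buf or any existing kernel
function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Cut-down stand-in holding only the fields of interest. */
struct fake_buf {
	long b_bufsize;		/* allocated size of the buffer */
	long b_bcount;		/* bytes requested for this transfer */
	long b_resid;		/* bytes supposedly still untransferred */
};

/* True if b_resid is within the range I would expect (an assumption). */
static bool
resid_sane(const struct fake_buf *bp)
{
	return bp->b_resid >= 0 && bp->b_resid <= bp->b_bcount;
}

/* Bytes by which b_resid overshoots b_bcount, or 0 if it does not. */
static long
resid_excess(const struct fake_buf *bp)
{
	return bp->b_resid > bp->b_bcount ? bp->b_resid - bp->b_bcount : 0;
}
```

With the values from the wedged buffer (b_bufsize = b_bcount = 0x2000,
b_resid = 0x258f), resid_sane() is false and resid_excess() is 0x58f,
that is, 9615 bytes "remaining" out of an 8192-byte request.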

Also, if anyone has any thoughts on how to figure out (and fix)
whatever's breaking here, I'd welcome them.

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B