Re: NFS lockup after UDP fragments getting lost
Edgar Fuß <ef%math.uni-bonn.de@localhost> writes:
> Thanks to riastradh@, this turned out to be caused by a (UDP, hard)
> NFS mount combined with a mis-configured IPFilter that blocked all but
> the first fragment of a fragmented NFS reply (e.g., readdir) combined
> with a NetBSD design error (or so Taylor says) that a vnode lock may
> be held across I/O, in this case, network I/O.
Holding a vnode lock across I/O seems like a bug to me too. Marking the
vnode as having an in-progress operation so others can
lock/read/report-that-status/unlock seems OK. But I'm sure you already
know that vnode locking is hard.
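
To make the idea concrete, here is a minimal user-space sketch of the "mark it busy, don't hold the lock" pattern. It is an illustration only, not NetBSD's vnode code: the `Node` class, its `busy` flag, and the `demo` helper are all hypothetical names, and the slow I/O is simulated with an event.

```python
import threading

class Node:
    """Hypothetical stand-in for a vnode; not NetBSD's actual structure."""

    def __init__(self):
        self._lock = threading.Lock()            # short-term lock only
        self._idle = threading.Condition(self._lock)
        self.busy = False                        # True while slow I/O is in flight

    def status(self):
        # lock / read / report / unlock: returns at once, even during I/O
        with self._lock:
            return "busy" if self.busy else "idle"

    def slow_operation(self, do_io):
        with self._lock:
            while self.busy:                     # wait for any earlier operation
                self._idle.wait()
            self.busy = True
        try:
            return do_io()                       # the lock is NOT held here
        finally:
            with self._lock:
                self.busy = False
                self._idle.notify_all()

def demo():
    node = Node()
    started, release = threading.Event(), threading.Event()
    results = []

    def fake_io():
        started.set()
        release.wait()                           # a reply that has not arrived yet
        return "reply"

    t = threading.Thread(
        target=lambda: results.append(node.slow_operation(fake_io)))
    t.start()
    started.wait()
    during = node.status()                       # does not block on the stuck I/O
    release.set()
    t.join()
    return during, node.status(), results
```

The point of the sketch is that `status()` still answers while the simulated I/O is stuck, whereas a lock held across `do_io()` would make every other caller wait for the lost reply.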
> It looks like the operation to which the reply was lost sometimes
> doesn't get retried. Do we have some weird bug where the first
> fragment arriving stops the timeout but the blocking of the remaining
> fragments causes it to wedge?
Probably not. Fragments sit in the IP reassembly queue until the whole
packet has been reassembled, and only then is the packet passed up the
stack. So the NFS code is almost certainly totally unaware of the
arrival of the first fragment.
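
As a rough illustration of why the upper layer never sees a lone first fragment, here is a toy reassembly queue. It is a sketch under simplifying assumptions, not the NetBSD IP code: the `Reassembler` class and its `deliver` callback are hypothetical, offsets are in bytes, and the reassembly timeout that a real stack uses to discard incomplete datagrams is left out.

```python
class Reassembler:
    """Toy IP-style reassembly queue (hypothetical; not NetBSD's code)."""

    def __init__(self, deliver):
        self.deliver = deliver    # upper-layer callback, e.g. the RPC receive path
        self.pending = {}         # datagram id -> {offset: (payload, more_fragments)}

    def receive(self, ident, offset, payload, more_fragments):
        frags = self.pending.setdefault(ident, {})
        frags[offset] = (payload, more_fragments)
        data = self._try_assemble(frags)
        if data is not None:      # complete: hand the whole packet up
            del self.pending[ident]
            self.deliver(data)

    @staticmethod
    def _try_assemble(frags):
        # Walk the fragments contiguously from offset 0 to the last one.
        expected, out = 0, []
        while expected in frags:
            payload, more = frags[expected]
            out.append(payload)
            if not more:
                return b"".join(out)
            if not payload:       # malformed empty fragment: cannot advance
                return None
            expected += len(payload)
        return None               # a hole remains, so nothing is delivered


delivered = []
r = Reassembler(delivered.append)
r.receive(1, 0, b"readdir-", True)    # first fragment only: nothing delivered
r.receive(1, 8, b"reply", False)      # last fragment: whole packet delivered
```

If the second `receive` call never happens, as with the filter dropping the later fragments, `deliver` is never invoked, so from the NFS client's point of view the reply was simply lost.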