NetBSD-Bugs archive
Re: kern/53928: modules/t_builtin:disable test case randomly fails
The following reply was made to PR kern/53928; it has been noted by GNATS.
From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc:
Subject: Re: kern/53928: modules/t_builtin:disable test case randomly fails
Date: Thu, 31 Jan 2019 13:02:05 +0700
Date: Wed, 30 Jan 2019 16:30:01 +0000 (UTC)
From: Andreas Gustafsson <gson%gson.org@localhost>
Message-ID: <20190130163001.6B03B7A1F7%mollari.NetBSD.org@localhost>
| I'm still confused as to which part of the system you think the
| bug lies in, and what would be the correct fix, because surely
| running "sysctl -w kern.maxvnodes=2" isn't it.
I have been waiting for someone more knowledgeable to answer
this, before throwing in my random guess ... but from what I
have read here, I suspect that the module unload routine (or
something that it calls) when doing the unload by a specific
request (rather than just the "can I unload this now" autounload
way) needs to stop and wait until all references to the filesystem
(including background kernel threads that only run periodically)
have finished with it.
Whether there is enough mechanism, on all sides, for this to be
done properly or not, I have no idea.
I assume the code already has a method to check that the filesystem
isn't in active use, so if it gets past that, and can lock the filesystem
so it cannot be returned to use again (by being mounted while we're
waiting), all it should need to do is delay for a short while if it
gets the EBUSY in question, and try again.
I suspect this belongs inside the vfs interface somewhere, as this
isn't the kind of thing a generic module interface ought to know
about ... but it is possible that the same strategy might be needed
for some other part of the system (device drivers, security modules, ...)
so it might be worth having a specific error that could be returned
to the generic module code to indicate that it should try again soon
(EAGAIN ?).  The VFS code would return that when the only reason it is
still busy is that it is waiting on the last cleanup actions to occur,
and the generic module code would detect that error, wait, and
retry (a small number of times, with gradually longer waits, up to
say a max of a second total) before giving up and returning an error
to the user level.
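To make the retry idea concrete, here is a rough userland sketch in C.
It is not kernel code and not taken from the NetBSD tree: the names
unload_with_retry(), fake_vfs_detach() and struct fake_fs, the 10 ms
starting wait, the doubling, and the choice to report EBUSY once the
one-second budget runs out are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical: one unload attempt.  Returns 0 on success, EAGAIN when
 * only pending cleanup remains, or any other errno for a hard failure. */
typedef int (*unload_attempt_fn)(void *cookie);

enum {
	RETRY_FIRST_WAIT_MS = 10,	/* assumed initial delay */
	RETRY_TOTAL_BUDGET_MS = 1000	/* "a max of a second total" */
};

static void
sleep_ms(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	nanosleep(&ts, NULL);
}

/* Generic side: retry on EAGAIN with gradually longer waits, and give
 * up with EBUSY (an assumed choice) once the total budget is spent. */
int
unload_with_retry(unload_attempt_fn attempt, void *cookie)
{
	long wait_ms = RETRY_FIRST_WAIT_MS;
	long waited_ms = 0;
	int error;

	assert(attempt != NULL);

	for (;;) {
		error = attempt(cookie);
		if (error != EAGAIN)
			return error;	/* success or hard failure */
		if (waited_ms >= RETRY_TOTAL_BUDGET_MS)
			return EBUSY;	/* budget exhausted, give up */
		if (wait_ms > RETRY_TOTAL_BUDGET_MS - waited_ms)
			wait_ms = RETRY_TOTAL_BUDGET_MS - waited_ms;
		sleep_ms(wait_ms);
		waited_ms += wait_ms;
		wait_ms *= 2;
	}
}

/* Stand-in for the VFS side (entirely made up for this sketch). */
struct fake_fs {
	int mounts;		/* active mounts: a real EBUSY */
	int pending_cleanup;	/* background work still holding references */
	int attempts;		/* how many times detach was tried */
};

int
fake_vfs_detach(void *cookie)
{
	struct fake_fs *fs = cookie;

	fs->attempts++;
	if (fs->mounts > 0)
		return EBUSY;		/* really in use, do not retry */
	if (fs->pending_cleanup > 0) {
		fs->pending_cleanup--;	/* pretend a cleanup step finished */
		return EAGAIN;		/* ask the caller to try again soon */
	}
	return 0;
}
```

With these numbers the waits are 10, 20, 40, 80, 160, 320 and then the
remaining 370 ms, so a filesystem that never finishes its cleanup is
tried 8 times before the caller sees an error, while one that is
actually mounted fails at once with EBUSY and is not retried.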
kre