NetBSD-Bugs archive


Re: kern/53928: modules/t_builtin:disable test case randomly fails



The following reply was made to PR kern/53928; it has been noted by GNATS.

From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: kern/53928: modules/t_builtin:disable test case randomly fails
Date: Thu, 31 Jan 2019 13:02:05 +0700

     Date:        Wed, 30 Jan 2019 16:30:01 +0000 (UTC)
     From:        Andreas Gustafsson <gson%gson.org@localhost>
     Message-ID:  <20190130163001.6B03B7A1F7%mollari.NetBSD.org@localhost>
 
   |  I'm still confused as to which part of the system you think the
   |  bug lies in, and what would be the correct fix, because surely
   |  running "sysctl -w kern.maxvnodes=2" isn't it.
 
 I have been waiting for someone more knowledgeable to answer
 this, before throwing in my random guess ... but from what I
 have read here, I suspect that the module unload routine (or
 something that it calls) when doing the unload by a specific
 request (rather than just the "can I unload this now" autounload
 way) needs to stop and wait until all references to the filesystem
 (including background kernel threads that only run periodically)
 have finished with it.
 
 Whether there is enough mechanism, on all sides, for this to be
 done properly or not, I have no idea.
 
 I assume the code already has a method to check that the filesystem
 isn't in active use, so if it gets past that, and can lock the filesystem
 so it cannot be returned to use again (by being mounted while we're
 waiting), all it should need to do is delay for a short while if it
 gets the EBUSY in question, and try again.
 
 I suspect this belongs inside the vfs interface somewhere, as this
 isn't the kind of thing a generic module interface ought to know
 about ... but it is possible that the same strategy might be needed
 for some other part of the system (device drivers, security modules, ...),
 so it might be worth having a specific error that could be returned
 to the generic module code to indicate that it should try again soon
 (EAGAIN ?).  The VFS code would return that when the only reason it is
 still busy is that it is waiting on the last cleanup actions to occur,
 and the generic module code would then detect that error, wait, and
 retry (a small number of times, with gradually longer waits, up to
 say a max of a second total) before giving up and returning an error
 to the user level.
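 
 The retry idea above could look roughly like the following.  This is
 only a userland sketch, not existing NetBSD code: unload_with_retry,
 unload_fn, delay_fn and the fake_* helpers are all hypothetical names,
 the doubling of the wait is just one way to get "gradually longer
 waits", and returning EBUSY on giving up is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback types; not an existing NetBSD interface. */
typedef int (*unload_fn)(void *cookie);
typedef void (*delay_fn)(void *cookie, unsigned ms);

/*
 * Call try_unload() until it stops reporting EAGAIN.  Each wait is
 * twice as long as the previous one, and we give up with EBUSY
 * (an assumed choice) once the next wait would push the total past
 * max_total_ms.
 */
static int
unload_with_retry(unload_fn try_unload, delay_fn delay, void *cookie,
    unsigned first_ms, unsigned max_total_ms)
{
	unsigned wait_ms = first_ms ? first_ms : 1;	/* never spin with 0 */
	unsigned waited_ms = 0;
	int error;

	assert(try_unload != NULL && delay != NULL);
	for (;;) {
		error = try_unload(cookie);
		if (error != EAGAIN)
			return error;	/* success, or a hard failure */
		if (wait_ms > max_total_ms - waited_ms)
			return EBUSY;	/* out of budget, give up */
		delay(cookie, wait_ms);
		waited_ms += wait_ms;
		wait_ms *= 2;
	}
}

/* Demo stand-in: a "filesystem" with some cleanup passes outstanding. */
struct fake_fs {
	int pending;		/* cleanup passes still to run */
	unsigned slept_ms;	/* total time the caller waited */
};

/* Pretends one cleanup pass finishes between successive attempts. */
static int
fake_unload(void *cookie)
{
	struct fake_fs *fs = cookie;

	if (fs->pending > 0) {
		fs->pending--;
		return EAGAIN;
	}
	return 0;
}

/* Records the wait instead of sleeping, so the demo runs instantly. */
static void
fake_delay(void *cookie, unsigned ms)
{
	struct fake_fs *fs = cookie;

	fs->slept_ms += ms;
}
```

 With a first wait of 10ms and a budget of 1000ms, a filesystem that
 needs three more cleanup passes unloads after waiting 10+20+40ms.  One
 that never finishes is abandoned once the next wait would no longer fit
 in the budget.  In the kernel the delay would presumably be something
 like kpause(9) rather than a callback.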
 
 kre
 

