NetBSD-Bugs archive


Re: kern/48212: modunload(8) for nfsserver leaves a dangling callout scheduled



The following reply was made to PR kern/48212; it has been noted by GNATS.

From: Paul Goyette <paul%whooppee.com@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, 
netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/48212: modunload(8) for nfsserver leaves a dangling callout
 scheduled
Date: Sun, 15 Sep 2013 06:31:40 -0700 (PDT)

 On Sun, 15 Sep 2013, Martin Husemann wrote:
 
 > The nfs timer callout should be disestablished by
 > nfsserver_modcmd(MODULE_CMD_FINI) -> nfs_fini() -> nfs_timer_fini().
 >
 > Are there hidden other timers in the code, that I overlooked?
 > Could you verify that above callchain happens on module unload for you?
 
 Further investigation shows that the specific callout is shared between 
 the nfs client and server.  The nfs_timer() callout routine itself 
 checks whether nfs_srvvec is set and, if so, calls it.  This check and 
 call are protected by the nfs_timer_lock mutex.  And the crash I am 
 seeing occurs when the nfs_timer() routine tries to grab the mutex!
 
 The code in nfsserver_modcmd() that is supposed to handle this is the 
 call to nfs_timer_srvfini(), which also grabs the mutex and then sets 
 nfs_srvvec to NULL.
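 
 The shared-hook arrangement described above can be sketched in userland 
 C.  This is only a model under stated assumptions: a pthread mutex 
 stands in for the kernel mutex, timer_tick() for nfs_timer(), srv_hook 
 for nfs_srvvec, and srv_fini() for nfs_timer_srvfini().  All names in 
 the sketch are hypothetical; it is not the NetBSD code and it does not 
 reproduce the crash.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userland stand-ins for the kernel objects in this PR. */
static pthread_mutex_t timer_lock = PTHREAD_MUTEX_INITIALIZER;
static void (*srv_hook)(void);  /* models nfs_srvvec; NULL when no server */
static int srv_calls;           /* counts how often the server hook ran */

/* Models the server-side work done from the shared timer. */
static void srv_timer(void)
{
    srv_calls++;
}

/* Models nfs_timer(): check the hook and call it, all under the lock. */
static void timer_tick(void)
{
    pthread_mutex_lock(&timer_lock);
    if (srv_hook != NULL)
        srv_hook();
    pthread_mutex_unlock(&timer_lock);
}

/* Models server module load: register the hook under the lock. */
static void srv_init(void)
{
    pthread_mutex_lock(&timer_lock);
    assert(srv_hook == NULL);   /* the hook is registered only once */
    srv_hook = srv_timer;
    pthread_mutex_unlock(&timer_lock);
}

/* Models nfs_timer_srvfini(): clear the hook under the same lock. */
static void srv_fini(void)
{
    pthread_mutex_lock(&timer_lock);
    srv_hook = NULL;
    pthread_mutex_unlock(&timer_lock);
}
```

 In this model a tick after srv_fini() still takes timer_lock, so the 
 arrangement is only safe for as long as the lock itself remains valid 
 whenever the timer can fire.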
 
 The crash is 100% reproducible on a 6-core machine, and I have confirmed 
 that nfsserver_modcmd() is definitely being invoked during modunload.
 
 The failing instruction within mutex_vector_enter() (at offset 0x91) is
 
        movq 0x18(%r15), %rax
 
 This corresponds to line 402 in sys/kern/kern_mutex.c
 
 397             /*
 398              * See lwp_dtor() why dereference of the LWP pointer is safe.
 399              * We must have kernel preemption disabled for that.
 400              */
 401             l = (lwp_t *)MUTEX_OWNER(owner);
 402             ci = l->l_cpu;
 
 ddb says that r15 (which holds the pointer l) contains a value of 
 0xfffffffffffffff0 (i.e., -0x10), so the effective address of the movq 
 instruction, 0x18(%r15), would be 0x8.
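 
 The address arithmetic can be checked with a few lines of C.  This is 
 a sketch that only models base-plus-displacement addressing with 64-bit 
 wraparound; effective_address() and check_reported_values() are 
 hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of x86-64 base+displacement addressing: the signed 32-bit
 * displacement is sign-extended and added to the base register,
 * wrapping modulo 2^64.
 */
static uint64_t effective_address(uint64_t base, int32_t disp)
{
    return base + (uint64_t)(int64_t)disp;
}

/* Values from the report: r15 = 0xfffffffffffffff0, displacement 0x18. */
static void check_reported_values(void)
{
    uint64_t r15 = UINT64_C(0xfffffffffffffff0);

    assert(effective_address(r15, 0x18) == UINT64_C(0x8));
}
```

 With r15 equal to -0x10, adding 0x18 wraps around to 0x8, which matches 
 the address given above.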
 
 
 
 -------------------------------------------------------------------------
 | Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
 | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
 | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
 | Kernel Developer |                          | pgoyette at netbsd.org  |
 -------------------------------------------------------------------------
 

