Subject: Re: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
To: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
From: Matthias Scheler <tron@zhadum.org.uk>
List: netbsd-bugs
Date: 10/27/2007 20:50:02
The following reply was made to PR bin/37236; it has been noted by GNATS.

From: Matthias Scheler <tron@zhadum.org.uk>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
Date: Sat, 27 Oct 2007 21:45:25 +0100

 On Sat, Oct 27, 2007 at 06:45:02PM +0000, Christos Zoulas wrote:
 > From: Christos Zoulas <christos@netbsd.org>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
 > Date: Sat, 27 Oct 2007 18:41:55 +0000 (UTC)
 > 
 >  Module Name:	src
 >  Committed By:	christos
 >  Date:		Sat Oct 27 18:41:55 UTC 2007
 >  
 >  Modified Files:
 >  	src/usr.sbin/rpc.lockd: lockd_lock.c
 >  
 >  Log Message:
 >  PR/37236: Matthias Scheler: Mac OS X NFS client frequently crashes rpc.lockd(8)
 >  on NetBSD. Use calloc to allocate the lock as suggested in the PR.
 >  
 >  
 >  To generate a diff of this commit:
 >  cvs rdiff -r1.26 -r1.27 src/usr.sbin/rpc.lockd/lockd_lock.c
 
 Thanks for the commit. But Martin Husemann guessed correctly that this
 doesn't fix the problem. I tried a binary with that change and it
 crashed again. But I have a better core dump this time (the line
 numbers refer to a "netbsd-4" branch source with lalloc() removed):
 
 #0  0xbbb825b8 in strcmp () from /usr/lib/libc.so.12
 #1  0x0804c91f in unlock (lck=0xbfbfe0d4, flags=2) at lockd_lock.c:386
 #2  0x0804b508 in nlm4_unlock_msg_4_svc (arg=0xbfbfe0cc, rqstp=0xbfbfe188)
     at lock_proc.c:1044
 #3  0x0804998d in nlm_prog_4 (rqstp=0xbfbfe188, transp=0x8063080)
     at nlm_prot_svc.c:469
 #4  0xbbb3ef48 in svc_getreq_common () from /usr/lib/libc.so.12
 #5  0xbbb3f04f in svc_getreqset () from /usr/lib/libc.so.12
 #6  0xbbae368b in svc_run () from /usr/lib/libc.so.12
 #7  0x0804a464 in main (argc=Cannot access memory at address 0x0
 
 Looking at unlock() reveals why it crashed:
 
 (gdb) up
 #1  0x0804c91f in unlock (lck=0xbfbfe0d4, flags=2) at lockd_lock.c:386
 386                     if (strcmp(fl->client_name, lck->caller_name) ||
 (gdb) print *fl
 Cannot access memory at address 0x202
 (gdb) print fl
 $1 = (struct file_lock *) 0x202
 (gdb) print lck
 $2 = (nlm4_lock *) 0xbfbfe0d4
 (gdb) print *lck
 $3 = {caller_name = 0x8051320 "excalibur.zhadum.org.uk", fh = {n_len = 28, 
     n_bytes = 0x8051340 ""}, oh = {n_len = 8, n_bytes = 0x8062210 ""}, 
   svid = 251, l_offset = 0, l_len = 0}
 
 It seems that the list of locks got corrupted:
 
 (gdb) print lcklst_head
 $1 = {lh_first = 0x60030210}
 (gdb) print *(struct file_lock *)0x60030210
 Cannot access memory at address 0x60030210
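
A minimal sketch, not the actual lockd_lock.c code, of why a corrupted list produces this kind of crash: a lookup that walks a <sys/queue.h> list trusts lh_first and every le_next, so a single bad pointer ends up as the argument to strcmp(). The trimmed struct layout, the field name lcklst, and the helpers lock_add(), lock_find() and lock_remove() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, trimmed-down stand-in for rpc.lockd's struct file_lock. */
struct file_lock {
	LIST_ENTRY(file_lock) lcklst;	/* list linkage (field name assumed) */
	char *client_name;		/* owned copy of the caller name */
};

LIST_HEAD(lock_list, file_lock);

/* Allocate a zeroed entry, copy the name, and link it at the list head. */
static struct file_lock *
lock_add(struct lock_list *head, const char *client)
{
	size_t len = strlen(client) + 1;
	struct file_lock *fl = calloc(1, sizeof(*fl));

	if (fl == NULL)
		return NULL;
	fl->client_name = malloc(len);
	if (fl->client_name == NULL) {
		free(fl);
		return NULL;
	}
	memcpy(fl->client_name, client, len);
	LIST_INSERT_HEAD(head, fl, lcklst);
	return fl;
}

/*
 * Walk the list the way a lookup would. Every pointer followed here
 * (lh_first, le_next, client_name) is dereferenced without any check
 * beyond NULL, so a stale or overwritten link faults inside strcmp().
 */
static struct file_lock *
lock_find(struct lock_list *head, const char *client)
{
	struct file_lock *fl;

	assert(client != NULL);
	for (fl = head->lh_first; fl != NULL; fl = fl->lcklst.le_next) {
		if (strcmp(fl->client_name, client) == 0)
			return fl;
	}
	return NULL;
}

/* Unlink the entry before freeing it, so no dangling link stays behind. */
static void
lock_remove(struct file_lock *fl)
{
	LIST_REMOVE(fl, lcklst);
	free(fl->client_name);
	free(fl);
}
```

Zeroing a new entry with calloc() only initializes that entry. It cannot repair a list head or a neighbouring link that already holds a bad address, which is consistent with the crash persisting after the commit.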
 
 
 
 	Kind regards
 
 -- 
 Matthias Scheler                                  http://zhadum.org.uk/