Subject: Re: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,>
From: Matthias Scheler <tron@zhadum.org.uk>
List: netbsd-bugs
Date: 10/27/2007 20:50:02
The following reply was made to PR bin/37236; it has been noted by GNATS.
From: Matthias Scheler <tron@zhadum.org.uk>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
Date: Sat, 27 Oct 2007 21:45:25 +0100
On Sat, Oct 27, 2007 at 06:45:02PM +0000, Christos Zoulas wrote:
> From: Christos Zoulas <christos@netbsd.org>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
> Date: Sat, 27 Oct 2007 18:41:55 +0000 (UTC)
>
> Module Name: src
> Committed By: christos
> Date: Sat Oct 27 18:41:55 UTC 2007
>
> Modified Files:
> src/usr.sbin/rpc.lockd: lockd_lock.c
>
> Log Message:
> PR/37236: Matthias Scheler: Mac OS X NFS client frequently crashes rpc.lockd(8)
> on NetBSD. Use calloc to allocate the lock as suggested in the PR.
>
>
> To generate a diff of this commit:
> cvs rdiff -r1.26 -r1.27 src/usr.sbin/rpc.lockd/lockd_lock.c
Thanks for the commit. But Martin Husemann guessed correctly that this
doesn't fix the problem. I tried a binary with that change and it
crashed again. But I have a better core dump this time (the line
numbers refer to a "netbsd-4" branch source with lalloc() removed):
#0 0xbbb825b8 in strcmp () from /usr/lib/libc.so.12
#1 0x0804c91f in unlock (lck=0xbfbfe0d4, flags=2) at lockd_lock.c:386
#2 0x0804b508 in nlm4_unlock_msg_4_svc (arg=0xbfbfe0cc, rqstp=0xbfbfe188)
at lock_proc.c:1044
#3 0x0804998d in nlm_prog_4 (rqstp=0xbfbfe188, transp=0x8063080)
at nlm_prot_svc.c:469
#4 0xbbb3ef48 in svc_getreq_common () from /usr/lib/libc.so.12
#5 0xbbb3f04f in svc_getreqset () from /usr/lib/libc.so.12
#6 0xbbae368b in svc_run () from /usr/lib/libc.so.12
#7 0x0804a464 in main (argc=Cannot access memory at address 0x0
Looking at unlock() reveals why it crashed:
(gdb) up
#1 0x0804c91f in unlock (lck=0xbfbfe0d4, flags=2) at lockd_lock.c:386
386 if (strcmp(fl->client_name, lck->caller_name) ||
(gdb) print *fl
Cannot access memory at address 0x202
(gdb) print fl
$1 = (struct file_lock *) 0x202
(gdb) print lck
$2 = (nlm4_lock *) 0xbfbfe0d4
(gdb) print *lck
$3 = {caller_name = 0x8051320 "excalibur.zhadum.org.uk", fh = {n_len = 28,
n_bytes = 0x8051340 ""}, oh = {n_len = 8, n_bytes = 0x8062210 ""},
svid = 251, l_offset = 0, l_len = 0}
It seems that the list of locks got corrupted:
(gdb) print lcklst_head
$1 = {lh_first = 0x60030210}
(gdb) print *(struct file_lock *)0x60030210
Cannot access memory at address 0x60030210
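For readers unfamiliar with the code, here is a minimal sketch of the kind of list walk that line 386 sits in. It is not the actual lockd_lock.c source: the struct is cut down to two members, and the names find_lock, lcklst and lcklst_head_t are made up for illustration. Only the lh_first head member and the strcmp() on client_name are taken from the gdb output above. The point is that such a loop trusts every pointer it pulls out of the list, so a bogus lh_first (or a bogus next pointer) ends up dereferenced in or just before strcmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/*
 * Hypothetical stand-in for the real struct file_lock; only the two
 * members needed for the illustration are shown.
 */
struct file_lock {
	LIST_ENTRY(file_lock) lcklst;	/* le_next / le_prev links */
	const char *client_name;	/* host name of the lock owner */
};

/* Declares struct lcklst_head_t { struct file_lock *lh_first; } */
LIST_HEAD(lcklst_head_t, file_lock);

/*
 * Walk the list and return the first lock whose client_name matches
 * caller_name, or NULL if there is none.
 */
static struct file_lock *
find_lock(struct lcklst_head_t *head, const char *caller_name)
{
	struct file_lock *fl;

	assert(head != NULL);
	for (fl = LIST_FIRST(head); fl != NULL; fl = LIST_NEXT(fl, lcklst)) {
		/*
		 * fl is dereferenced here without any validation. The only
		 * check the loop makes is against NULL, so a garbage pointer
		 * such as 0x202 goes straight into the comparison.
		 */
		if (strcmp(fl->client_name, caller_name) == 0)
			return fl;
	}
	return NULL;
}
```

With an intact list the walk simply ends at the NULL terminator. Nothing in the loop can tell a valid node from a stale or overwritten one, which is consistent with the crash showing up here rather than at the place where the list was damaged.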
Kind regards
--
Matthias Scheler http://zhadum.org.uk/