Subject: Re: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
To: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
From: Matthias Scheler <tron@zhadum.org.uk>
List: netbsd-bugs
Date: 11/04/2007 19:55:02
The following reply was made to PR bin/37236; it has been noted by GNATS.

From: Matthias Scheler <tron@zhadum.org.uk>
To: NetBSD GNATS <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: PR/37236 CVS commit: src/usr.sbin/rpc.lockd
Date: Sun, 4 Nov 2007 19:52:50 +0000

 On Thu, Nov 01, 2007 at 03:00:14PM +0000, Matthias Scheler wrote:
 >  This time the loop in unlock() was executed although fl is NULL. The
 >  crash looks like a race between sigchild_handler() and one of the
 >  dispatch procedures. That should however not happen because of the
 >  calls to siglock() and sigunlock().
 
 I've changed lalloc() to add redzones before and after each "struct file_lock"
 and added checks that verify the redzones, e.g. in lfree() and in each
 LIST_FOREACH loop. "rpc.lockd" crashed again with this stack trace:
 
 #0  0x0804bc39 in get_alloc (fl=0x8b030210) at lockd_lock.c:470
 470             assert(memcmp(fla->redzone_head, redzone_head_pattern,
 (gdb) where
 #0  0x0804bc39 in get_alloc (fl=0x8b030210) at lockd_lock.c:470
 #1  0x0804cb82 in unlock (lck=0xbfbfe0e4, flags=2) at lockd_lock.c:413
 #2  0x0804b518 in nlm4_unlock_msg_4_svc (arg=0xbfbfe0dc, rqstp=0xbfbfe198)
     at lock_proc.c:1044
 #3  0x0804999d in nlm_prog_4 (rqstp=0xbfbfe198, transp=0x8063080)
     at nlm_prot_svc.c:469
 #4  0xbbb3ef48 in svc_getreq_common () from /usr/lib/libc.so.12
 #5  0xbbb3f04f in svc_getreqset () from /usr/lib/libc.so.12
 #6  0xbbae368b in svc_run () from /usr/lib/libc.so.12
 #7  0x0804a474 in main (argc=Cannot access memory at address 0x20
 ) at lockd.c:211
 
 The reason is heap corruption:
 
 (gdb) print lcklst_head
 $2 = {lh_first = 0x8b030210}
 (gdb) print *(struct file_lock *)0x8b030210
 Cannot access memory at address 0x8b030210
 (gdb) print hostlst_head
 $3 = {lh_first = 0x76b5bb51}
 (gdb) print *(struct host *)0x76b5bb51
 Cannot access memory at address 0x76b5bb51
 
 These pointers are completely invalid: neither list head points to
 accessible memory. This rules out the theory about a race condition.
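 
 For reference, the redzone scheme described above can be sketched roughly as
 follows. This is a minimal illustration and not the actual lockd_lock.c
 change: the layout of "struct file_lock_alloc", the pattern bytes,
 REDZONE_SIZE, and the helpers to_alloc() and redzones_intact() are all
 assumptions, and "struct file_lock" is reduced to a stand-in. Only the names
 lalloc(), lfree(), get_alloc(), redzone_head and redzone_head_pattern come
 from the report and the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define REDZONE_SIZE 8	/* assumed size; the real value is not in the report */

/* Stand-in for the real "struct file_lock"; these fields are hypothetical. */
struct file_lock {
	int client_id;
	int status;
};

/* Assumed wrapper layout: guard bytes on both sides of the payload. */
struct file_lock_alloc {
	unsigned char redzone_head[REDZONE_SIZE];
	struct file_lock fl;
	unsigned char redzone_tail[REDZONE_SIZE];
};

/* Arbitrary guard patterns chosen for this sketch. */
static const unsigned char redzone_head_pattern[REDZONE_SIZE] =
    { 0xde, 0xad, 0xbe, 0xef, 0xde, 0xad, 0xbe, 0xef };
static const unsigned char redzone_tail_pattern[REDZONE_SIZE] =
    { 0xfe, 0xed, 0xfa, 0xce, 0xfe, 0xed, 0xfa, 0xce };

/* Recover the wrapper from a pointer to the embedded payload. */
static struct file_lock_alloc *
to_alloc(struct file_lock *fl)
{
	return (struct file_lock_alloc *)
	    ((char *)fl - offsetof(struct file_lock_alloc, fl));
}

/* Return 1 if both guard areas still hold their patterns, 0 otherwise. */
static int
redzones_intact(struct file_lock *fl)
{
	struct file_lock_alloc *fla = to_alloc(fl);

	return memcmp(fla->redzone_head, redzone_head_pattern,
	        REDZONE_SIZE) == 0 &&
	    memcmp(fla->redzone_tail, redzone_tail_pattern,
	        REDZONE_SIZE) == 0;
}

/* Checked accessor: abort if either redzone has been overwritten. */
static struct file_lock_alloc *
get_alloc(struct file_lock *fl)
{
	assert(redzones_intact(fl));
	return to_alloc(fl);
}

/* Allocate a zeroed file_lock surrounded by freshly written redzones. */
static struct file_lock *
lalloc(void)
{
	struct file_lock_alloc *fla = malloc(sizeof(*fla));

	if (fla == NULL)
		return NULL;
	memcpy(fla->redzone_head, redzone_head_pattern, REDZONE_SIZE);
	memcpy(fla->redzone_tail, redzone_tail_pattern, REDZONE_SIZE);
	memset(&fla->fl, 0, sizeof(fla->fl));
	return &fla->fl;
}

/* Verify the redzones one last time, then release the whole wrapper. */
static void
lfree(struct file_lock *fl)
{
	free(get_alloc(fl));
}
```

 A check like this only detects writes that run over the edges of an
 allocated "struct file_lock". If the list head itself is overwritten with
 an address that is not mapped, the memcmp() in the check would presumably
 fault before any comparison takes place, which would be consistent with
 the crash in get_alloc() shown in the trace.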
 
 	Kind regards
 
 -- 
 Matthias Scheler                                  http://zhadum.org.uk/