Subject: kern/20138: gcore over nfs causes a panic
To: None <gnats-bugs@gnats.netbsd.org>
From: None <nathanw@wasabisystems.com>
List: netbsd-bugs
Date: 01/31/2003 15:09:09
>Number:         20138
>Category:       kern
>Synopsis:       gcore or PT_DUMPCORE over nfs causes a panic
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jan 31 12:10:01 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Nathan J. Williams
>Release:        NetBSD 1.6M
>Organization:
	Wasabi Systems, Inc.
>Environment:
System: NetBSD crash-test-dummy.nathanw.com 1.6M NetBSD 1.6M (CTD) #5: Wed Jan 29 17:57:47 EST 2003     nathanw@marvin-the-martian.nathanw.com:/u1/nbsd/src/sys/arch/i386/compile/CTD i386
Architecture: i386
Machine: i386
>Description:

Running the new gcore(1), or any other caller of ptrace(PT_DUMPCORE),
so that the core dump is written to an NFS filesystem triggers a panic
in nfs_write(), in the (#ifdef DIAGNOSTIC) check at nfs_bio.c, line
529:

	if (uio->uio_segflg == UIO_USERSPACE && uio->uio_procp != curproc)
		panic("nfs_write proc");

since curproc is the process calling ptrace() and uio->uio_procp is
the process whose core is being dumped.
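
To make the mismatch concrete, here is a minimal userland sketch of the
same condition. It is only a model: the struct and enum names
(proc_model, uio_model, MODEL_UIO_USERSPACE) and the function
nfs_write_check_would_panic() are hypothetical stand-ins, not the real
kernel definitions, and the function returns true where the kernel
would call panic().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the NetBSD definitions. */
enum uio_seg_model { MODEL_UIO_USERSPACE, MODEL_UIO_SYSSPACE };

struct proc_model {
	int pid;
};

struct uio_model {
	enum uio_seg_model uio_segflg;      /* where the data buffer lives */
	const struct proc_model *uio_procp; /* process owning the buffer */
};

/*
 * Mirrors the condition of the DIAGNOSTIC check quoted above: returns
 * true in the case where nfs_write() would panic("nfs_write proc").
 */
bool
nfs_write_check_would_panic(const struct uio_model *uio,
    const struct proc_model *curproc)
{
	return uio->uio_segflg == MODEL_UIO_USERSPACE &&
	    uio->uio_procp != curproc;
}
```

With uio_procp pointing at the process being dumped and curproc at the
process calling ptrace(), the function returns true; it returns false
when both are the same process, or when the buffer is not in user space.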

Traceback:

#8  0xc0102c1c in calltrap ()
#9  0xc01d72f9 in panic (fmt=0xc029b0d0 "nfs_write proc")
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../kern/subr_prf.c:227
#10 0xc016969b in nfs_write (v=0xc6b4681c)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../nfs/nfs_bio.c:530
#11 0xc01fb837 in VOP_WRITE (vp=0xc6b326f8, uio=0xc6b46870, ioflag=9, 
    cred=0xc0566980)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../kern/vnode_if.c:458
can not access 0x804a000, invalid translation (invalid PDE)
#12 0xc01fad83 in vn_rdwr (rw=UIO_WRITE, vp=0xc6b326f8, 
    base=0x804a000 <Address 0x804a000 out of bounds>, len=8192, offset=4096, 
    segflg=UIO_USERSPACE, ioflg=9, cred=0xc0566980, aresid=0x0, p=0xc698cc84)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../kern/vfs_vnops.c:394
#13 0xc01b57b2 in coredump_writesegs_elf32 (p=0xc698cc84, vp=0xc6b326f8, 
    cred=0xc0566980, us=0xc6b4690c)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../kern/core_elf32.c:273
#14 0xc0213744 in uvm_coredump_walkmap (p=0xc698cc84, vp=0xc6b326f8, 
    cred=0xc0566980, func=0xc01b5774 <coredump_writesegs_elf32>, 
    cookie=0xc6b4697c)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../uvm/uvm_glue.c:759
#15 0xc01b56a0 in coredump_elf32 (l=0xc693c990, vp=0xc6b326f8, cred=0xc0566980)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../kern/core_elf32.c:203
#16 0xc01c9b58 in coredump (l=0xc693c990)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../kern/kern_sig.c:1673
#17 0xc01dce45 in sys_ptrace (l=0xc693ca18, v=0xc6b46f80, retval=0xc6b46f78)
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../kern/sys_process.c:319
#18 0xc0230a7b in syscall_plain (frame={tf_gs = 31, tf_fs = 31, tf_es = 31, 
      tf_ds = 31, tf_edi = -1077937740, tf_esi = 205, tf_ebp = -1077937852, 
      tf_ebx = 1, tf_edx = 0, tf_ecx = 0, tf_eax = 26, tf_trapno = 3, 
      tf_err = 2, tf_eip = 1208453018, tf_cs = 23, tf_eflags = 582, 
      tf_esp = -1077937896, tf_ss = 31, tf_vm86_es = 0, tf_vm86_ds = 0, 
      tf_vm86_fs = 0, tf_vm86_gs = 0})
    at /nbsd/src/sys/arch/i386/compile/CTD/../../../../arch/i386/i386/syscall.c:156
#19 0xc0100b1f in syscall1 ()


>How-To-Repeat:

1 crash-test-dummy:nathanw>pwd
/m/home/nathanw
2 crash-test-dummy:nathanw>mount
/dev/wd0a on / type ffs (local)
/dev/wd0e on /usr type ffs (local)
procfs on /proc type procfs (local)
mtm:/u1 on /m type nfs
3 crash-test-dummy:nathanw>cat &
[1] 205
4 crash-test-dummy:nathanw>gcore 205

>Fix:

Unknown. Perhaps the DIAGNOSTIC check can simply be considered wrong?
I'm not sure exactly what problem it's supposed to detect, and it goes
back all the way to revision 1.1.
>Release-Note:
>Audit-Trail:
>Unformatted: