I was running a wee test program this morning, and when it crashed it
seems the kernel itself crashed while trying to write out the core file.
The current working directory, and the target of the core file, is an
NFS mount.
The core file was created, but is empty:
$ ls -l *.core
-rw------- 1 woods ostaff 0 Jul 9 11:17 tpgsqltime.core
The system is running my version of 9.99.64, so it's not quite current,
and thus I wanted to ask whether this particular crash is already known
before I send-pr.
I think this is the first time I've had a core dump over NFS since
updating the kernel from 8.99.32. So I'm not sure yet how easily this
is reproduced, but in any case it is a regression.
Here's what was on the console:
[ 71887.4479952] fatal double fault in supervisor mode
[ 71887.4479952] trap type 13 code 0 rip 0xffffffff809c5051 cs 0x8 rflags 0x10286 cr2 0xffff8b827c3e4f98 i
3e4fa0
[ 71887.4479952] curlwp 0xffff8693578524c0 pid 29079.29079 lowest kstack 0xffff8b827c3e32c0
kernel: double fault trap, code=0
Stopped in pid 29079.29079 (tpgsqltime) at netbsd:ip_output+0x14: movq %rsi,fffffffffffffe68(%rbp
ip_output() at netbsd:ip_output+0x14
tcp_output() at netbsd:tcp_output+0xc68
tcp_send_wrapper() at netbsd:tcp_send_wrapper+0x9a
sosend() at netbsd:sosend+0x7e4
nfs_send() at netbsd:nfs_send+0x86
nfs_request() at netbsd:nfs_request+0x3d4
nfs_readrpc() at netbsd:nfs_readrpc+0x204
nfs_doio() at netbsd:nfs_doio+0x731
VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x64
genfs_getpages() at netbsd:genfs_getpages+0x1400
nfs_getpages() at netbsd:nfs_getpages+0x5d
VOP_GETPAGES() at netbsd:VOP_GETPAGES+0x80
uvm_fault_internal() at netbsd:uvm_fault_internal+0x1895
trap() at netbsd:trap+0x4e5
--- trap (number 6) ---
copyin() at netbsd:copyin+0x2f
uiomove() at netbsd:uiomove+0xb7
ubc_uiomove() at netbsd:ubc_uiomove+0x156
nfs_write() at netbsd:nfs_write+0x129
VOP_WRITE() at netbsd:VOP_WRITE+0x65
vn_rdwr() at netbsd:vn_rdwr+0xcc
coredump_write() at netbsd:coredump_write+0x56
coredump_elf64() at netbsd:coredump_elf64+0x89c
coredump() at netbsd:coredump+0x650
sigexit() at netbsd:sigexit+0x27c
sendsig() at netbsd:sendsig
lwp_userret() at netbsd:lwp_userret+0x1c5
trap() at netbsd:trap+0x9b7
--- trap (number 6) ---
7c5294:
ds 23
es 23
fs 0
gs 0
rdi ffff869202438bc0
rsi 0
rbp ffff8b827c3e5160
rbx ffff8693660f4988
rdx ffff869364f08818
rcx 400
rax 0
r8 0
r9 ffff869364f087b8
r10 ffff869202438bc0
r11 0
r12 ffff869364a93040
r13 a0
r14 ffff869364a930b0
r15 6c
rip ffffffff809c5051 ip_output+0x14
cs 8
rflags 10286
rsp ffff8b827c3e4fa0
ss 0
netbsd:ip_output+0x14: movq %rsi,fffffffffffffe68(%rbp)
db{0}> machine cpu
addr               dev  id flags ipis spl curlwp
0xffffffff8163a800 cpu0 0  3009  0    8   0xffff8693578524c0
0xffff8b825ded0000 cpu1 4  f002  0    0   0xffff868c2a81e1c0
0xffff8b825e0ec000 cpu2 2  f002  0    4   0xffff86934acd26c0
0xffff8b825e16d000 cpu3 6  f002  0    0   0xffff868c2ad6c200
0xffff8b825e19e000 cpu4 1  f002  0    0   0xffff868c2a9ec340
0xffff8b825e1cf000 cpu5 5  f002  0    0   0xffff868c2aa9d080
0xffff8b825e200000 cpu6 3  f002  0    0   0xffff868c2aa8e100
0xffff8b825e231000 cpu7 7  f002  0    0   0xffff868c2ab3f180
db{0}> ps
PID   LID   S CPU FLAGS   STRUCT LWP *     NAME       WAIT
29079>29079 7 0   1000000 ffff8693578524c0 tpgsqltime
I do have a full kernel core dump, but it's 32GB (345M compressed), and
probably contains data I don't want to share.
--
Greg A. Woods <gwoods%acm.org@localhost>
Kelowna, BC +1 250 762-7675 RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost> Avoncote Farms <woods%avoncote.ca@localhost>