So it doesn't seem like this crash has anything to do with NFS after all.
I've been doing package builds in a sandboxctl chroot that access NFS
sources (read-only) but are otherwise entirely confined to a local
filesystem, albeit through sandboxctl's null mounts. After many core
dumps (mostly from GNU configure scripts), one eventually caused another
similar-looking crash.
This one did a core dump, but savecore didn't think there was enough
free space left in /var/crash to recover it (even though there is enough
space for dozens of the compressed cores if they compress as well as the
last one).
(The original crash messages are quoted below for comparison.)
[ 200974.6716318] fatal double fault in supervisor mode
[ 200974.6716318] trap type 13 code 0 rip 0xffffffff80e3c127 cs 0x8 rflags 0x10286 cr2 0xffff9a02af3e6f88
e6f90
[ 200974.6816277] curlwp 0xffff90f14a2e2bc0 pid 1591.1591 lowest kstack 0xffff9a02af3e52c0
kernel: double fault trap, code=0
Stopped in pid 1591.1591 (conftest) at netbsd:radix_tree_gang_lookup_node+0x1a: movq %rdx,ffff)
radix_tree_gang_lookup_node() at netbsd:radix_tree_gang_lookup_node+0x1a
uvm_page_array_fill() at netbsd:uvm_page_array_fill+0x14b
uvm_page_array_fill_and_peek() at netbsd:uvm_page_array_fill_and_peek+0x1e
uvn_findpage() at netbsd:uvn_findpage+0x88
uvn_findpages() at netbsd:uvn_findpages+0xcd
genfs_getpages() at netbsd:genfs_getpages+0x959
VOP_GETPAGES() at netbsd:VOP_GETPAGES+0x58
uvn_get() at netbsd:uvn_get+0x57
ubc_fault() at netbsd:ubc_fault+0x182
uvm_fault_internal() at netbsd:uvm_fault_internal+0x51e
trap() at netbsd:trap+0x4e5
--- trap (number 6) ---
kcopy() at netbsd:kcopy+0x15
uiomove() at netbsd:uiomove+0xb7
ubc_uiomove() at netbsd:ubc_uiomove+0x156
ffs_write() at netbsd:ffs_write+0x251
layer_bypass() at netbsd:layer_bypass+0x102
VOP_WRITE() at netbsd:VOP_WRITE+0x40
vn_rdwr() at netbsd:vn_rdwr+0xcc
coredump_write() at netbsd:coredump_write+0xa0
coredump_elf64() at netbsd:coredump_elf64+0x43a
coredump() at netbsd:coredump+0x650
sigexit() at netbsd:sigexit+0x27c
sendsig_siginfo() at netbsd:sendsig_siginfo+0x323
trapsignal() at netbsd:trapsignal+0x371
trap() at netbsd:trap+0x8e7
--- trap (number 6) ---
400581:
ds 23
es 23
fs 0
gs 0
rdi ffff90eb408bdd58
rsi 0
rbp ffff9a02af3e7080
rbx ffff9a02af3e7190
rdx ffff9a02af3e71b0
rcx 1
rax ffffffff80e3c10d radix_tree_gang_lookup_node
r8 0
r9 1
r10 0
r11 2
r12 ffff90eb408bdd40
r13 1
r14 0
r15 ffff90eb408bdd58
rip ffffffff80e3c127 radix_tree_gang_lookup_node+0x1a
cs 8
rflags 10286
rsp ffff9a02af3e6f90
ss 0
netbsd:radix_tree_gang_lookup_node+0x1a: movq %rdx,ffffffffffffff10(%rbp)
db{3}>
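To make the two traces easier to compare, here is a small Python sketch that computes how far the faulting address (cr2) and the stack pointer (rsp) sit from the reported "lowest kstack" in each crash. The values are copied from the console output in this message and in the quoted one below; the function name `stack_offsets` is made up for illustration, the 4 KiB page size is an assumption, and no interpretation of the numbers is implied.

```python
def stack_offsets(cr2, rsp, lowest_kstack):
    """Return byte distances between the values printed by the trap handler."""
    return {
        "rsp_minus_cr2": rsp - cr2,
        "rsp_minus_lowest": rsp - lowest_kstack,
        "cr2_page_offset": cr2 & 0xFFF,  # assumes 4 KiB pages
    }


# (cr2, rsp, lowest kstack) as printed on the console for each crash
crashes = {
    "conftest (ffs via null mount)": (
        0xffff9a02af3e6f88, 0xffff9a02af3e6f90, 0xffff9a02af3e52c0),
    "tpgsqltime (NFS)": (
        0xffff8b827c3e4f98, 0xffff8b827c3e4fa0, 0xffff8b827c3e32c0),
}

for name, (cr2, rsp, lowest) in crashes.items():
    offs = stack_offsets(cr2, rsp, lowest)
    print(name, {k: hex(v) for k, v in offs.items()})
```

In both crashes the faulting address comes out 8 bytes below rsp.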
savecore: reboot after panic: reboot forced via kernel debugger
savecore: system went down at Sat Jul 11 19:35:25 2020
savecore: no dump, not enough free space in /var/crash
$ df -h /var/crash/
Filesystem Size Used Avail %Cap MountedOn
/dev/dk2 3.9G 1.5G 2.2G 40% /var
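One possible explanation for the savecore refusal, not checked against the savecore source, is that the space check is made against the uncompressed dump size (plus a copy of the kernel) and the reserve given in /var/crash/minfree, rather than against the size the compressed files end up at. A rough Python model of that kind of check follows; `avail_kib` and `would_save` are hypothetical names, and the dump, kernel, and reserve sizes in the usage lines are invented purely for illustration.

```python
import os


def avail_kib(path):
    """Kilobytes available to unprivileged users on the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize // 1024


def would_save(avail_kib_now, dump_bytes, kernel_bytes, minfree_kib):
    """Assumed model: refuse if a reserve is set and saving would dip below it."""
    needed_kib = (dump_bytes + kernel_bytes) // 1024
    return not (minfree_kib > 0 and avail_kib_now - needed_kib < minfree_kib)


GIB = 1024 ** 3
MIB = 1024 ** 2

# 2.2G available is taken from the df output above; the 8 GiB dump,
# 20 MiB kernel, and 1 MiB reserve are hypothetical values.
print(would_save(int(2.2 * GIB) // 1024, 8 * GIB, 20 * MIB, 1024))
```

Under this model an uncompressed dump larger than the available space is refused however well it would compress, but only when a non-zero minfree is set. On the live system `avail_kib("/var/crash")` would supply the first argument.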
At Thu, 09 Jul 2020 18:03:23 -0700, "Greg A. Woods" <woods%planix.ca@localhost> wrote:
Subject: is this crash while coredumping to NFS known?
>
> Here's what was on the console:
>
> [ 71887.4479952] fatal double fault in supervisor mode
> [ 71887.4479952] trap type 13 code 0 rip 0xffffffff809c5051 cs 0x8 rflags 0x10286 cr2 0xffff8b827c3e4f98 i
> 3e4fa0
> [ 71887.4479952] curlwp 0xffff8693578524c0 pid 29079.29079 lowest kstack 0xffff8b827c3e32c0
> kernel: double fault trap, code=0
> Stopped in pid 29079.29079 (tpgsqltime) at netbsd:ip_output+0x14: movq %rsi,fffffffffffffe68(%rbp
>
> ip_output() at netbsd:ip_output+0x14
> tcp_output() at netbsd:tcp_output+0xc68
> tcp_send_wrapper() at netbsd:tcp_send_wrapper+0x9a
> sosend() at netbsd:sosend+0x7e4
> nfs_send() at netbsd:nfs_send+0x86
> nfs_request() at netbsd:nfs_request+0x3d4
> nfs_readrpc() at netbsd:nfs_readrpc+0x204
> nfs_doio() at netbsd:nfs_doio+0x731
> VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x64
> genfs_getpages() at netbsd:genfs_getpages+0x1400
> nfs_getpages() at netbsd:nfs_getpages+0x5d
> VOP_GETPAGES() at netbsd:VOP_GETPAGES+0x80
> uvm_fault_internal() at netbsd:uvm_fault_internal+0x1895
> trap() at netbsd:trap+0x4e5
> --- trap (number 6) ---
> copyin() at netbsd:copyin+0x2f
> uiomove() at netbsd:uiomove+0xb7
> ubc_uiomove() at netbsd:ubc_uiomove+0x156
> nfs_write() at netbsd:nfs_write+0x129
> VOP_WRITE() at netbsd:VOP_WRITE+0x65
> vn_rdwr() at netbsd:vn_rdwr+0xcc
> coredump_write() at netbsd:coredump_write+0x56
> coredump_elf64() at netbsd:coredump_elf64+0x89c
> coredump() at netbsd:coredump+0x650
> sigexit() at netbsd:sigexit+0x27c
> sendsig() at netbsd:sendsig
> lwp_userret() at netbsd:lwp_userret+0x1c5
> trap() at netbsd:trap+0x9b7
> --- trap (number 6) ---
> 7c5294:
> ds 23
> es 23
> fs 0
> gs 0
> rdi ffff869202438bc0
> rsi 0
> rbp ffff8b827c3e5160
> rbx ffff8693660f4988
> rdx ffff869364f08818
> rcx 400
> rax 0
> r8 0
> r9 ffff869364f087b8
> r10 ffff869202438bc0
> r11 0
> r12 ffff869364a93040
> r13 a0
> r14 ffff869364a930b0
> r15 6c
> rip ffffffff809c5051 ip_output+0x14
> cs 8
> rflags 10286
> rsp ffff8b827c3e4fa0
> ss 0
> netbsd:ip_output+0x14: movq %rsi,fffffffffffffe68(%rbp)
> db{0}> machine cpu
> addr dev id flags ipis spl curlwp
> 0xffffffff8163a800 cpu0 0 3009 0 8 0xffff8693578524c0
> 0xffff8b825ded0000 cpu1 4 f002 0 0 0xffff868c2a81e1c0
> 0xffff8b825e0ec000 cpu2 2 f002 0 4 0xffff86934acd26c0
> 0xffff8b825e16d000 cpu3 6 f002 0 0 0xffff868c2ad6c200
> 0xffff8b825e19e000 cpu4 1 f002 0 0 0xffff868c2a9ec340
> 0xffff8b825e1cf000 cpu5 5 f002 0 0 0xffff868c2aa9d080
> 0xffff8b825e200000 cpu6 3 f002 0 0 0xffff868c2aa8e100
> 0xffff8b825e231000 cpu7 7 f002 0 0 0xffff868c2ab3f180
> db{0}> ps
> PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
> 29079>29079 7 0 1000000 ffff8693578524c0 tpgsqltime
>
--
Greg A. Woods <gwoods%acm.org@localhost>
Kelowna, BC +1 250 762-7675 RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost> Avoncote Farms <woods%avoncote.ca@localhost>