kern/51614: vdrain/cache trap panics on NetBSD/amd64-7.0_STABLE

To: kern-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost
Subject: kern/51614: vdrain/cache trap panics on NetBSD/amd64-7.0_STABLE
From: jdbaker%mylinuxisp.com@localhost
Date: Tue, 8 Nov 2016 22:40:00 +0000 (UTC)

>Number:         51614
>Category:       kern
>Synopsis:       vdrain/cache trap panics on NetBSD/amd64-7.0_STABLE
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 08 22:40:00 +0000 2016
>Originator:     John D. Baker
>Release:        NetBSD/amd64-7.0_STABLE
>Organization:
>Environment:
NetBSD yggdrasil.technoskunk.fur 7.0_STABLE NetBSD 7.0_STABLE (YGGDRASIL) #45: Fri Oct 14 10:10:14 CDT 2016  sysop%yggdrasil.technoskunk.fur@localhost:/r0/build/netbsd-7/obj/amd64/sys/arch/amd64/compile/YGGDRASIL amd64

>Description:
My post to netbsd-users@:

Every so often, my file server panics and reboots--which it did
just a few hours ago.

The system runs a RAIDframe RAID-R across 8 1TB SATA disks with a
single filesystem.  It also monitors a USB-attached UPS via 'apcupsd',
and serves as slave DNS and NTP.

The saved core reports, via 'crash':

$ crash -N netbsd.6 -M netbsd.6.core
Crash version 7.0_STABLE, image version 7.0_STABLE.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
_KERNEL_OPT_NAGR() at 0
vpanic() at vpanic+0x145
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
cache_purge1() at cache_purge1+0x10f
vclean() at vclean+0xa8
cleanvnode() at cleanvnode+0xd0
vdrain_thread() at vdrain_thread+0x58
crash>

The previous occasion (29 July 2016) showed:

$ crash -N netbsd.5 -M netbsd.5.core 
Crash version 7.0_STABLE, image version 7.0_STABLE.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
_KERNEL_OPT_NAGR() at 0
vpanic() at vpanic+0x145
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
cache_reclaim() at cache_reclaim+0x201
cache_thread() at cache_thread+0x15

Before that (25 Dec 2015):

$ crash -N netbsd.4 -M netbsd.4.core 
Crash version 7.0_STABLE, image version 7.0_STABLE.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
_KERNEL_OPT_NAGR() at 0
vpanic() at vpanic+0x145
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
uvm_pagefree() at uvm_pagefree+0xd4
genfs_do_putpages() at genfs_do_putpages+0xce0
VOP_PUTPAGES() at VOP_PUTPAGES+0x3a
uvm_pageout() at uvm_pageout+0x2f1

and before that (7 Nov 2015):

$ crash -N netbsd.3 -M netbsd.3.core 
Crash version 7.0_STABLE, image version 7.0_STABLE.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
_KERNEL_OPT_NAGR() at 0
vpanic() at vpanic+0x145
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
ufsquota_free() at ufsquota_free+0x15
ufs_reclaim() at ufs_reclaim+0xaf
ffs_reclaim() at ffs_reclaim+0xa1
VOP_RECLAIM() at VOP_RECLAIM+0x2f
vclean() at vclean+0xa6
cleanvnode() at cleanvnode+0xb8
vdrain_thread() at vdrain_thread+0x58

And the earliest I have saved (6 Nov 2015):

$ crash -N netbsd.2 -M netbsd.2.core 
Crash version 7.0_STABLE, image version 7.0_STABLE.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
_KERNEL_OPT_NAGR() at 0
vpanic() at vpanic+0x145
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
VOP_PUTPAGES() at VOP_PUTPAGES+0x3a
uvm_pageout() at uvm_pageout+0x2f1
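
The five backtraces above differ mainly in the frame listed right after
calltrap().  As a rough aid for comparing them, here is a minimal sketch
in Python that pulls that frame out of pasted 'bt' output.  It assumes
the first frame listed after calltrap() (or alltraps(), as seen in
console tracebacks) is where the trap occurred; the helper name
faulting_frame is hypothetical and not part of any NetBSD tool.

```python
import re

# Trap-entry frames. Assumption: the next frame listed after one of these
# is the code that was executing when the trap was taken.
TRAP_ENTRY = ("calltrap", "alltraps")

# Matches lines such as:
#   cache_purge1() at cache_purge1+0x10f
#   uvm_pagefree() at netbsd:uvm_pagefree+0xd4
#   snprintf() at snprintf
FRAME_RE = re.compile(
    r"^(\w+)\(\) at (?:netbsd:)?(\w+)(?:\+(0x[0-9a-f]+))?$"
)


def faulting_frame(backtrace):
    """Return (function, offset) of the first frame after the trap entry.

    Returns None if no trap-entry frame, or no frame after it, is found.
    The offset is None when the frame has no "+0x..." part.
    """
    after_trap = False
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m is None:
            continue  # prompts, banners and blank lines are skipped
        func, _symbol, offset = m.groups()
        if after_trap:
            return func, offset
        if func in TRAP_ENTRY:
            after_trap = True
    return None


if __name__ == "__main__":
    sample = "\n".join([
        "vpanic() at vpanic+0x145",
        "snprintf() at snprintf",
        "startlwp() at startlwp",
        "calltrap() at calltrap+0x11",
        "cache_purge1() at cache_purge1+0x10f",
        "vclean() at vclean+0xa8",
    ])
    print(faulting_frame(sample))
```

Fed the netbsd.6 trace, this returns ('cache_purge1', '0x10f'); the
other four traces can be run through it the same way.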

Greg Oster reports:

One of my machines started doing
something similar to your last panic:

uvm_fault(0xffffffff81041020, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff809c29bd cs 8 rflags 10202 cr2 12 ilevel 0 rsp fffffe810ed60b00
curlwp 0xfffffe823ce98780 pid 0.95 lowest kstack 0xfffffe810ed5e2c0
panic: trap
cpu6: Begin traceback...
vpanic() at netbsd:vpanic+0x13c
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0x96
uvm_pagefree() at netbsd:uvm_pagefree+0xd4
genfs_do_putpages() at netbsd:genfs_do_putpages+0xce0
VOP_PUTPAGES() at netbsd:VOP_PUTPAGES+0x3a
uvm_pageout() at netbsd:uvm_pageout+0x2f1
cpu6: End traceback...
uvm_fault(0xfffffe813bd35e70, 0x0, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 2 rip ffffffff805ac769 cs 8 rflags 10202 cr2 84 ilevel 8 rsp fffffe806abeed98
curlwp 0xfffffe80a5198b60 pid 1766.1 lowest kstack 0xfffffe806abec2c0

dumping to dev 18,1 (offset=18259895, size=2092553):

but I haven't investigated what's up yet.  (It crashed Oct 13, and
then Nov 7 and Nov 8.)
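
The two "trap type 6" lines in the console output above can be read
field by field.  The following is a minimal sketch in Python, under
these assumptions: the trap type is printed in decimal, the other
numeric fields are printed in hexadecimal, and the "code" field is the
standard x86 page-fault error code (bit 0 = page was present, bit 1 =
write access, bit 2 = fault taken in user mode).  The helper name
decode_page_fault is hypothetical.

```python
import re

# Picks the trap type, error code, rip and cr2 out of a console line such as
#   trap type 6 code 0 rip ffffffff809c29bd cs 8 rflags 10202 cr2 12 ilevel
TRAP_RE = re.compile(
    r"trap type (\d+) code ([0-9a-f]+) rip ([0-9a-f]+) .*? cr2 ([0-9a-f]+)"
)


def decode_page_fault(line):
    """Decode a 'trap type ...' console line into a dict, or return None.

    Assumes hexadecimal code/rip/cr2 fields and the x86 page-fault
    error-code bit layout described above.
    """
    m = TRAP_RE.search(line)
    if m is None:
        return None
    code = int(m.group(2), 16)
    return {
        "trap_type": int(m.group(1)),
        "rip": int(m.group(3), 16),
        "fault_addr": int(m.group(4), 16),   # cr2: address that faulted
        "page_present": bool(code & 0x1),
        "write": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
    }


if __name__ == "__main__":
    first = "trap type 6 code 0 rip ffffffff809c29bd cs 8 rflags 10202 cr2 12 ilevel"
    second = "trap type 6 code 2 rip ffffffff805ac769 cs 8 rflags 10202 cr2 84 ilevel"
    for line in (first, second):
        info = decode_page_fault(line)
        print(hex(info["fault_addr"]), "write" if info["write"] else "read")
```

Under those assumptions the first fault is a kernel-mode read at address
0x12 and the second a kernel-mode write at 0x84, both on non-present
pages.  Addresses that small would be consistent with accessing a
structure member through a NULL pointer, though that is only a guess
until the cores are examined.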

>How-To-Repeat:
See above.  The panics seem to happen entirely at random, with no
obvious trigger.  I am not sure whether there was any unusual activity
on the machine at the time.
>Fix:
