Current-Users archive
Crash on -current in pool_drain()
Under heavy load, and after several hours of building packages, I am
seeing the following crash. I'm doing a bisect to narrow it down further,
but it has been happening for at least a week, with the kernel and all
modules built from sources updated on 2015-10-13 at 08:30:00 UTC.
(This is on amd64.)
Here's the backtrace from gdb:
(gdb) bt
#0 0xffffffff801196a5 in cpu_reboot (howto=howto@entry=256,
bootstr=bootstr@entry=0x0)
at /build/netbsd-local/src/sys/arch/amd64/amd64/machdep.c:671
#1 0xffffffff80237fd6 in db_sync_cmd (addr=<optimized out>,
have_addr=<optimized out>, count=<optimized out>,
modif=<optimized out>)
at /build/netbsd-local/src/sys/ddb/db_command.c:1359
#2 0xffffffff80238797 in db_command (
last_cmdp=last_cmdp@entry=0xffffffff806b90a0 <db_last_command>)
at /build/netbsd-local/src/sys/ddb/db_command.c:908
#3 0xffffffff80238ade in db_command_loop ()
at /build/netbsd-local/src/sys/ddb/db_command.c:566
#4 0xffffffff8023c16d in db_trap (type=type@entry=6, code=code@entry=0)
at /build/netbsd-local/src/sys/ddb/db_trap.c:90
#5 0xffffffff80116130 in kdb_trap (type=type@entry=6,
code=code@entry=0, regs=regs@entry=0xfffffe810f528ce0)
at /build/netbsd-local/src/sys/arch/amd64/amd64/db_interface.c:227
#6 0xffffffff8011a82f in trap (frame=0xfffffe810f528ce0)
at /build/netbsd-local/src/sys/arch/amd64/amd64/trap.c:287
#7 0xffffffff80100fde in alltraps ()
#8 0xffffffff80333415 in pool_drain (ppp=ppp@entry=0xfffffe810f528e30)
at /build/netbsd-local/src/sys/kern/subr_pool.c:1429
#9 0xffffffff802d1791 in uvm_pageout (arg=<optimized out>)
at /build/netbsd-local/src/sys/uvm/uvm_pdaemon.c:343
#10 0xffffffff80100807 in lwp_trampoline ()
#11 0x0000000000000000 in ?? ()
(gdb) fr 8
#8 0xffffffff80333415 in pool_drain (ppp=ppp@entry=0xfffffe810f528e30)
at /build/netbsd-local/src/sys/kern/subr_pool.c:1429
1429 if (drainpp == NULL) {
(gdb) disass pool_drain
Dump of assembler code for function pool_drain:
0xffffffff803333da <+0>: push %rbp
0xffffffff803333db <+1>: mov %rsp,%rbp
0xffffffff803333de <+4>: push %r13
0xffffffff803333e0 <+6>: push %r12
0xffffffff803333e2 <+8>: push %rbx
0xffffffff803333e3 <+9>: mov %rdi,%r12
0xffffffff803333e6 <+12>: cmpq $0x0,0x38d732(%rip) # 0xffffffff806c0b20 <pool_head>
0xffffffff803333ee <+20>: je 0xffffffff803334a2 <pool_drain+200>
0xffffffff803333f4 <+26>: mov $0xffffffff80732f98,%rdi
0xffffffff803333fb <+33>: callq 0xffffffff8011bbc0 <mutex_enter>
0xffffffff80333400 <+38>: mov 0x38d719(%rip),%rcx # 0xffffffff806c0b20 <pool_head>
0xffffffff80333407 <+45>: mov 0x3ffb92(%rip),%rax # 0xffffffff80732fa0 <drainpp>
0xffffffff8033340e <+52>: xor %ebx,%ebx
0xffffffff80333410 <+54>: test %rax,%rax
0xffffffff80333413 <+57>: je 0xffffffff8033342a <pool_drain+80>
=> 0xffffffff80333415 <+59>: mov (%rax),%rdx
0xffffffff80333418 <+62>: mov %rax,%rbx
0xffffffff8033341b <+65>: mov 0x58(%rbx),%eax
0xffffffff8033341e <+68>: test %eax,%eax
0xffffffff80333420 <+70>: jne 0xffffffff80333442 <pool_drain+104>
0xffffffff80333422 <+72>: mov %rdx,%rax
0xffffffff80333425 <+75>: test %rax,%rax
...
(gdb) list pool_drain
1413 *
1414 * Note, must never be called from interrupt context.
1415 */
1416 bool
1417 pool_drain(struct pool **ppp)
1418 {
1419 bool reclaimed;
1420 struct pool *pp;
1421
1422 KASSERT(!TAILQ_EMPTY(&pool_head));
1423
1424 pp = NULL;
1425
1426 /* Find next pool to drain, and add a reference. */
1427 mutex_enter(&pool_head_lock);
1428 do {
1429 if (drainpp == NULL) {
1430 drainpp = TAILQ_FIRST(&pool_head);
1431 }
1432 if (drainpp != NULL) {
1433 pp = drainpp;
1434 drainpp = TAILQ_NEXT(pp, pr_poollist);
1435 }
1436 /*
1437 * Skip completely idle pools. We depend on at least
1438 * one pool in the system being active.
1439 */
1440 } while (pp == NULL || pp->pr_npages == 0);
1441 pp->pr_refcnt++;
1442 mutex_exit(&pool_head_lock);
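For readers following the listing, here is a minimal user-space sketch of the same round-robin cursor pattern. It is only a model: `struct mpool`, `mhead`, `cursor`, `model_add()` and `model_next()` are hypothetical names standing in for `struct pool`, `pool_head`, `drainpp` and the loop in pool_drain(), and the locking and reference counting are left out.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct pool; only the fields the loop uses. */
struct mpool {
	TAILQ_ENTRY(mpool) link;	/* plays the role of pr_poollist */
	unsigned npages;		/* plays the role of pr_npages */
};

TAILQ_HEAD(mpool_head, mpool);

static struct mpool_head mhead = TAILQ_HEAD_INITIALIZER(mhead);
static struct mpool *cursor;		/* plays the role of drainpp */

/* Append a pool to the model list. */
void
model_add(struct mpool *p, unsigned npages)
{
	p->npages = npages;
	TAILQ_INSERT_TAIL(&mhead, p, link);
}

/*
 * Pick the next non-idle pool, resuming from where the previous call
 * stopped and wrapping to the head once the cursor runs off the end.
 * Like the loop in the listing, this never terminates if every pool
 * is idle, and it dereferences whatever the cursor was left pointing at.
 */
struct mpool *
model_next(void)
{
	struct mpool *p = NULL;

	do {
		if (cursor == NULL)
			cursor = TAILQ_FIRST(&mhead);
		if (cursor != NULL) {
			p = cursor;
			cursor = TAILQ_NEXT(p, link);
		}
	} while (p == NULL || p->npages == 0);

	return p;
}
```

The point of the model is that the cursor persists between calls, so each call trusts that the element it was left pointing at is still a valid list member.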
Interestingly, the symbol pp doesn't seem to be available here, even
though it controls (partially) the enclosing while (...) loop.
(gdb) print *pp
No symbol "pp" in current context.
The routine's argument doesn't seem to exist here, either:
(gdb) print **ppp
No symbol "ppp" in current context.
drainpp has the same value as pool_head.tqh_last, so it seems to point
to the end of the list at pool_head:
(gdb) print drainpp
$1 = (struct pool *) 0xffffffff8099fb40
(gdb) print pool_head
$2 = {tqh_first = 0xffffffff80724880 <uvm_amap_cache>,
tqh_last = 0xffffffff8099fb40}
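A note on reading those two values: in the <sys/queue.h> TAILQ macros, tqh_last holds the address of the last element's tqe_next field, not of the element itself. The two addresses coincide only when the list entry is the first member of the structure. The sketch below checks that relationship in user space. `struct item` is a hypothetical stand-in for `struct pool`, and placing the linkage at offset 0 is an assumption here, although the `mov (%rax),%rdx` load in the disassembly seems consistent with it.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct pool. */
struct item {
	TAILQ_ENTRY(item) link;	/* assumed to sit at offset 0 */
	int id;
};

TAILQ_HEAD(item_head, item);

/*
 * Returns 1 if, after two tail insertions, head.tqh_last is the address
 * of the last element's tqe_next field and that address is also the
 * address of the last element itself.
 */
int
tqh_last_aliases_last_element(void)
{
	struct item_head head = TAILQ_HEAD_INITIALIZER(head);
	struct item a = { .id = 1 }, b = { .id = 2 };

	TAILQ_INSERT_TAIL(&head, &a, link);
	TAILQ_INSERT_TAIL(&head, &b, link);

	return TAILQ_LAST(&head, item_head) == &b &&
	    head.tqh_last == &b.link.tqe_next &&
	    (void *)head.tqh_last == (void *)&b &&
	    offsetof(struct item, link) == 0;
}
```

If the same layout holds for struct pool, a drainpp equal to tqh_last would be pointing at the last pool on the list rather than past it.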
I'm not good enough at x86 assembler to decode much further...
Anyone got a clue?
+------------------+--------------------------+-------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+-------------------------+