NetBSD-Bugs archive
kern/52043: npf kernel panic on sparc64
>Number: 52043
>Category: kern
>Synopsis: npf kernel panic on sparc64
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Mar 07 05:50:00 +0000 2017
>Originator: Dakotah Lambert
>Release: NetBSD 7.0.2
>Organization:
Earlham College
>Environment:
NetBSD lutra.lutras-hacking.ddns.net 7.0.2 NetBSD 7.0.2 (GENERIC.DEBUG) #0: Mon Mar 6 19:35:56 EST 2017 root@:/var/src/sys/arch/sparc64/compile/GENERIC.DEBUG sparc64
>Description:
I have a Sun Netra T1 AC200 server, UltraSPARC IIe at 500MHz with 1 GB RAM, and two hard drives. I run an SSH server on the machine (public-key only, no passwords), and the SSH log tends to fill up with failed authentication attempts from what I assume are bots. Since SCSI drives are hard to find, I installed fail2ban and configured npf in hopes of reducing the volume of data that gets written to this log.
The contents of /etc/npf.conf are:
---
set bpf.jit off
$ext_if=gem0
$local_net=192.168.2.0/25
table <fail2ban> type tree dynamic
group "external" on $ext_if {
	pass in final from $local_net
	block in final from <fail2ban>
	pass out final all
	pass all
}
group default {
	pass final on lo0 all
	block all
}
---
The "set bpf.jit off" was added because npf told me to put it in. The "gem0" is one of the two built-in Ethernet interfaces of the server.
Before I configured npf and allowed the module (the only LKM I use) to load, the server never went down unexpectedly. Unfortunately, after I made this change its reliability fell from "constantly up" to "crashes after a couple of hours to a day".
Since the crash appears in ptree_insert_node_common (backtrace at end of section), I am tempted to believe that changing my table from "tree" to "hash" might act as a work-around, but I have not tested this yet.
$ ident /netbsd | grep ptree.c
$NetBSD: ptree.c,v 1.10 2012/10/06 22:15:09 matt Exp $
$ ident /stand/sparc64/7.0/modules/npf | grep npf_tableset.c
$NetBSD: npf_tableset.c,v 1.22 2014/08/11 01:54:12 rmind Exp $
$ ident /stand/sparc64/7.0/modules/npf | grep npf_ctl.c
$NetBSD: npf_ctl.c,v 1.38.2.3 2015/06/10 16:57:58 snj Exp $
I am not sure where "line 501" comes from, as the assertion that failed appears to be at line 450 in the actual C code.
But following the backtrace, it looks like npf_table_insert has its third parameter set to 0. From npf_ctl.c:
751 case NPF_CMD_TABLE_ADD:
752 error = npf_table_insert(t, nct->nct_data.ent.alen,
753 &nct->nct_data.ent.addr, nct->nct_data.ent.mask);
754 break;
That would mean "&nct->nct_data.ent.addr" is evaluating to 0 (NULL). Might that be the problem?
---
panic: kernel diagnostic assertion "PTN_LEAF_POSITION(ptn) == id.id_parent_slot" failed: file "../../../../../../lib/libkern/../../../common/lib/libc/gen/ptree.c", line 501
cpu0: Begin traceback...
cpu0: End traceback...
Stopped in pid 1426.1 (npfctl) at netbsd:cpu_Debugger+0x4: nop
db{0}> bt
db{0}> sync
Frame pointer is at 0x12dbbc411
Call traceback:
netbsd:cpu_reboot+0x208(a, 1c99748, 0, 1c99400, 1cd4b60, 1c93800) fp = 12dbbc4d1
netbsd:db_sync_cmd+0x20(100, 0, 1c19c00, 1cb3000, f, 102d3c960) fp = 12dbbc581
netbsd:db_command+0x94(10f7144, 0, ffffffffffffffff, 12dbbcef8, 2, 73) fp = 12dbbc631
netbsd:db_command_loop+0x118(1c16be0, 1c16c40, 0, 1c9b000, 1c16800, 16a3fe8) fp = 12dbbc771
netbsd:db_trap+0x100(10f7148, 0, 18787e0, 1c19c00, 1c16be0, 1c9b000) fp = 12dbbc851
netbsd:kdb_trap+0xdc(101, 0, 1838ac0, e0048000, 1cb0000, 0) fp = 12dbbc911
netbsd:trap+0x4a0(101, 12dbbd3c0, 4, 1c19c00, 1c00000, 1cf3400) fp = 12dbbc9c1
netbsd:1010e40+0(12dbbd3c0, 101, 10f7140, 441d0006, 14bdc60, 1cf36e0) fp = 12dbbcb11
netbsd:vpanic+0x16c(18787e0, 1cf35b0, 1825548, e0048000, 1c19c00, 1c19c00) fp = 12dbbccf1
netbsd:kern_assert+0x34(1825548, 12dbbd6e8, 1cf2000, 1cf35b0, 1cf3400, 104) fp = 12dbbcda1
netbsd:ptree_insert_node_common+0x308(1825548, 1825580, 18c00c0, 18bfcb8, 1f5, 10109ef90) fp = 12dbbce61
npf:npf_table_insert+0x198(100f1c908, 102e43e80, 0, 7fff, 2014000, 16203a0) fp = 12dbbcf41
npf:npfctl_table+0xc8(100f1c908, 4, 12dbbdc94, ff, 0, 16) fp = 12dbbd001
netbsd:cdev_ioctl+0x68(12dbbdc80, 80284e67, 12dbbdc80, 1, 102d3c960, 0) fp = 12dbbd0d1
netbsd:VOP_IOCTL+0x38(c600, 80284e67, 12dbbdc80, 1, 102d3c960, 203bad0) fp = 12dbbd181
netbsd:vn_ioctl+0xa4(1019553a0, 80284e67, 12dbbdc80, 1, 100ee3ec0, 0) fp = 12dbbd261
netbsd:sys_ioctl+0x254(10270c400, 80284e67, 12dbbdc80, 12dbba000, 1, 1019553a0) fp = 12dbbd3c1
netbsd:syscall+0x3a8(0, 12dbbdde0, 1020907d0, 0, 10270c400, 80284e67) fp = 12dbbd501
netbsd:101106c+0(12dbbded0, 4e, fffffffffe559700, 36, 12dbbdf40, 102d3c960) fp = 12dbbd621
netbsd:10cca0+0(3, 80284e67, ffffffffffffbac8, ffffffffffffbadc, 2c, ffffffffffffbadc) fp = ffffffffffffb1c1
dumping to dev 7,1 offset 2098887
dump succeeded
cpu0: rebooting
>How-To-Repeat:
1) Boot server
2) Enable npf and fail2ban
3) Wait
4) After a few hours, the system has crashed
>Fix:
Workaround: Do not load the npf module. This is not satisfactory.