NetBSD-Bugs archive


kern/57208: panic related to NPF functionality



>Number:         57208
>Category:       kern
>Synopsis:       panic related to NPF functionality
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jan 31 12:20:00 +0000 2023
>Originator:     he%NetBSD.org@localhost
>Release:        NetBSD 9.2
>Organization:
I Try...
>Environment:
System: NetBSD xxx.xxxx.net 9.2 NetBSD 9.2 (GENERIC) #0: Wed May 12 13:15:55 UTC 2021  mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
	We recently experienced a number of panic()s related to NPF.
	Inspecting the resulting core dumps with "dmesg" reveals:

[   134.887867] fatal page fault in supervisor mode
[   134.887867] trap type 6 code 0x2 rip 0xffffffff80997103 cs 0x8 rflags 0x10286 cr2 0 ilevel 0x4 rsp 0xffff8000660fec80
[   134.887867] curlwp 0xffffa2ae4201e480 pid 0.3 lowest kstack 0xffff8000660fc2c0
[   134.887867] panic: trap
[   134.887867] cpu0: Begin traceback...
[   134.887867] vpanic() at netbsd:vpanic+0x160
[   134.887867] snprintf() at netbsd:snprintf
[   134.887867] startlwp() at netbsd:startlwp
[   134.897870] alltraps() at netbsd:alltraps+0xbb
[   134.897870] thmap_del() at netbsd:thmap_del+0x1d2
[   134.897870] npf_conndb_remove() at netbsd:npf_conndb_remove+0x37
[   134.897870] npf_conn_establish() at netbsd:npf_conn_establish+0x1ca
[   134.897870] npfk_packet_handler() at netbsd:npfk_packet_handler+0x428
[   134.897870] pfil_run_hooks() at netbsd:pfil_run_hooks+0x122
[   134.897870] ipintr() at netbsd:ipintr+0x3a3
[   134.897870] softint_dispatch() at netbsd:softint_dispatch+0xab
[   134.897870] cpu0: End traceback...

[   134.897870] dumping to dev 4,1 (offset=12469831, size=1029687):
[   134.897870] dump 

or

[ 2206938.776755] uvm_fault(0xfffffa6db0377008, 0x0, 2) -> e
[ 2206938.776755] fatal page fault in supervisor mode
[ 2206938.776755] trap type 6 code 0x2 rip 0xffffffff80997103 cs 0x8 rflags 0x10286 cr2 0 ilevel 0x4 rsp 0xffffd58070e864e0
[ 2206938.776755] curlwp 0xfffffa6d801e8340 pid 1321.3 lowest kstack 0xffffd58070e842c0
[ 2206938.776755] panic: trap
[ 2206938.776755] cpu2: Begin traceback...
[ 2206938.776755] vpanic() at netbsd:vpanic+0x160
[ 2206938.776755] snprintf() at netbsd:snprintf
[ 2206938.776755] startlwp() at netbsd:startlwp
[ 2206938.776755] alltraps() at netbsd:alltraps+0xbb
[ 2206938.776755] thmap_del() at netbsd:thmap_del+0x1d2
[ 2206938.786760] npf_conndb_remove() at netbsd:npf_conndb_remove+0x37
[ 2206938.786760] npf_conn_establish() at netbsd:npf_conn_establish+0x1ca
[ 2206938.786760] npfk_packet_handler() at netbsd:npfk_packet_handler+0x428
[ 2206938.786760] pfil_run_hooks() at netbsd:pfil_run_hooks+0x122
[ 2206938.786760] ip6_output() at netbsd:ip6_output+0x11cb
[ 2206938.786760] udp6_output() at netbsd:udp6_output+0x544
[ 2206938.786760] udp6_send_wrapper() at netbsd:udp6_send_wrapper+0x51
[ 2206938.786760] sosend() at netbsd:sosend+0x722
[ 2206938.796763] do_sys_sendmsg_so() at netbsd:do_sys_sendmsg_so+0x211
[ 2206938.796763] do_sys_sendmsg() at netbsd:do_sys_sendmsg+0xac
[ 2206938.796763] sys_sendmsg() at netbsd:sys_sendmsg+0x47
[ 2206938.796763] syscall() at netbsd:syscall+0x157
[ 2206938.796763] --- syscall (number 28) ---
[ 2206938.796763] 74fd1b487aba:
[ 2206938.796763] cpu2: End traceback...

[ 2206938.796763] dumping to dev 4,1 (offset=12469831, size=1029687):
[ 2206938.796763] dump 
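
	Both trap lines above report "code 0x2" and "cr2 0".  As a reading
	aid, here is a minimal sketch in C that decodes the x86 page-fault
	error code bits.  The names pf_info, pf_decode and pf_describe are
	made up for this illustration and are not kernel functions; the
	bit layout is the architectural one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bits of the x86 page-fault error code (the "code" field of the trap line). */
#define PF_PRESENT 0x01u /* 0: page not present, 1: protection violation */
#define PF_WRITE   0x02u /* 0: read access, 1: write access */
#define PF_USER    0x04u /* 0: supervisor mode, 1: user mode */
#define PF_RSVD    0x08u /* reserved bit set in a paging structure */
#define PF_INSTR   0x10u /* fault caused by an instruction fetch */

/* Hypothetical helper type, for illustration only. */
struct pf_info {
	bool present, write, user, rsvd, instr;
};

static struct pf_info
pf_decode(uint32_t code)
{
	struct pf_info pf = {
		.present = (code & PF_PRESENT) != 0,
		.write = (code & PF_WRITE) != 0,
		.user = (code & PF_USER) != 0,
		.rsvd = (code & PF_RSVD) != 0,
		.instr = (code & PF_INSTR) != 0,
	};
	return pf;
}

/* Format a one-line summary; cr2 holds the faulting address. */
static int
pf_describe(char *buf, size_t len, uint32_t code, uint64_t cr2)
{
	struct pf_info pf = pf_decode(code);
	const char *access = pf.instr ? "instruction fetch from" :
	    pf.write ? "write to" : "read from";

	return snprintf(buf, len, "%s %s page in %s mode, fault address 0x%llx",
	    access, pf.present ? "protected" : "not-present",
	    pf.user ? "user" : "supervisor", (unsigned long long)cr2);
}
```

	Calling pf_describe(buf, sizeof(buf), 0x2, 0) describes the fault
	in both tracebacks as a supervisor-mode write to a not-present
	page at address 0, that is, a store through a null pointer.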


	Looking at one of the core dumps with gdb after building
	netbsd.gdb with the -P switch for reproducible builds reveals:

# gdb /usr/obj/sys/arch/amd64/compile/GENERIC/netbsd.gdb
GNU gdb (GDB) 8.3
...
(gdb) target kvm netbsd.6.core
0xffffffff80222aaa in cpu_reboot (howto=howto@entry=260, 
    bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:728
728                     dumpsys();
(gdb) info line *(thmap_del+0x1d2)
Line 876 of "/usr/src/sys/kern/subr_thmap.c"
   starts at address 0xffffffff80997a70 <thmap_del+466>
   and ends at 0xffffffff80997a75 <thmap_del+471>.
(gdb) where
#0  0xffffffff80222aaa in cpu_reboot (howto=howto@entry=260, 
    bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:728
#1  0xffffffff80994a96 in vpanic (fmt=0xffffffff811114f8 "trap", 
    fmt@entry=0xffffffff81111538 "ault", ap=ap@entry=0xffff9400660fea48)
    at /usr/src/sys/kern/subr_prf.c:336
#2  0xffffffff80994b47 in panic (fmt=fmt@entry=0xffffffff81111538 "ault")
    at /usr/src/sys/kern/subr_prf.c:255
#3  0xffffffff80224aed in trap (frame=0xffff9400660feb90)
    at /usr/src/sys/arch/amd64/amd64/trap.c:334
#4  0xffffffff8021d56b in alltraps ()
#5  0xffffffff80997103 in stage_mem_gc (thmap=thmap@entry=0xffffd7202ae60080, 
    addr=18446699131934760928, len=len@entry=24)
    at /usr/src/sys/kern/subr_thmap.c:888
#6  0xffffffff80997a70 in thmap_del (thmap=0xffffd7202ae60080, 
    key=key@entry=0xffffd7202f7a2134, len=len@entry=16)
    at /usr/src/sys/kern/subr_thmap.c:875
#7  0xffffffff80764655 in npf_conndb_remove (cd=cd@entry=0xffffd7202ae60058, 
    ck=ck@entry=0xffffd7202f7a2134) at /usr/src/sys/net/npf/npf_conndb.c:235
#8  0xffffffff80763197 in npf_conn_establish (
    npc=npc@entry=0xffff9400660fee50, di=di@entry=1, global=<optimized out>)
    at /usr/src/sys/net/npf/npf_conn.c:502
#9  0xffffffff8075e6e9 in npfk_packet_handler (npf=0xffffd7202adfdcc0, 
    mp=0xffff9400660fef00, ifp=<optimized out>, di=1)
    at /usr/src/sys/net/npf/npf_handler.c:257
#10 0xffffffff80a4b6b6 in pfil_run_hooks (ph=<optimized out>, 
    mp=mp@entry=0xffff9400660fefe0, ifp=ifp@entry=0xffff9400071db008, 
    dir=dir@entry=1) at /usr/src/sys/net/pfil.c:417
#11 0xffffffff806f1eca in ip_input (m=<optimized out>)
    at /usr/src/sys/netinet/ip_input.c:578
#12 ipintr (arg=<optimized out>) at /usr/src/sys/netinet/ip_input.c:402
#13 0xffffffff8096f549 in softint_execute (l=<optimized out>, s=4, 
    si=0xffff9400660f4230) at /usr/src/sys/kern/kern_softint.c:592
#14 softint_dispatch (pinned=<optimized out>, s=4)
    at /usr/src/sys/kern/kern_softint.c:881
#15 0xffffffff8021d24f in Xsoftintr ()
...
(gdb) up
#1  0xffffffff80994a96 in vpanic (fmt=0xffffffff811114f8 "trap", 
    fmt@entry=0xffffffff81111538 "ault", ap=ap@entry=0xffff9400660fea48)
    at /usr/src/sys/kern/subr_prf.c:336
336             cpu_reboot(bootopt, NULL);
(gdb) up
#2  0xffffffff80994b47 in panic (fmt=fmt@entry=0xffffffff81111538 "ault")
    at /usr/src/sys/kern/subr_prf.c:255
255             vpanic(fmt, ap);
(gdb) up
#3  0xffffffff80224aed in trap (frame=0xffff9400660feb90)
    at /usr/src/sys/arch/amd64/amd64/trap.c:334
334                     panic("trap");
(gdb) up
#4  0xffffffff8021d56b in alltraps ()
(gdb) up
#5  0xffffffff80997103 in stage_mem_gc (thmap=thmap@entry=0xffffd7202ae60080, 
    addr=18446699131934760928, len=len@entry=24)
    at /usr/src/sys/kern/subr_thmap.c:888
888             gc = kmem_intr_alloc(sizeof(thmap_gc_t), KM_NOSLEEP);
(gdb) up
#6  0xffffffff80997a70 in thmap_del (thmap=0xffffd7202ae60080, 
    key=key@entry=0xffffd7202f7a2134, len=len@entry=16)
    at /usr/src/sys/kern/subr_thmap.c:875
875             stage_mem_gc(thmap, THMAP_GETOFF(thmap, leaf), sizeof(thmap_leaf_t));
(gdb) up
#7  0xffffffff80764655 in npf_conndb_remove (cd=cd@entry=0xffffd7202ae60058, 
    ck=ck@entry=0xffffd7202f7a2134) at /usr/src/sys/net/npf/npf_conndb.c:235
235             val = thmap_del(cd->cd_map, ck->ck_key, keylen);
(gdb) up
#8  0xffffffff80763197 in npf_conn_establish (
    npc=npc@entry=0xffff9400660fee50, di=di@entry=1, global=<optimized out>)
    at /usr/src/sys/net/npf/npf_conn.c:502
502                     ret = npf_conndb_remove(conn_db, fw);
(gdb) up
#9  0xffffffff8075e6e9 in npfk_packet_handler (npf=0xffffd7202adfdcc0, 
    mp=0xffff9400660fef00, ifp=<optimized out>, di=1)
    at /usr/src/sys/net/npf/npf_handler.c:257
257                     con = npf_conn_establish(&npc, di,
(gdb) up
#10 0xffffffff80a4b6b6 in pfil_run_hooks (ph=<optimized out>, 
    mp=mp@entry=0xffff9400660fefe0, ifp=ifp@entry=0xffff9400071db008, 
    dir=dir@entry=1) at /usr/src/sys/net/pfil.c:417
417                     ret = (*func)(pfh->pfil_arg, &m, ifp, dir);
(gdb) up
#11 0xffffffff806f1eca in ip_input (m=<optimized out>)
    at /usr/src/sys/netinet/ip_input.c:578
578                     freed = pfil_run_hooks(inet_pfil_hook, &m, ifp, PFIL_IN) != 0;
(gdb) up
#12 ipintr (arg=<optimized out>) at /usr/src/sys/netinet/ip_input.c:402
402                     ip_input(m);
(gdb) up
#13 0xffffffff8096f549 in softint_execute (l=<optimized out>, s=4, 
    si=0xffff9400660f4230) at /usr/src/sys/kern/kern_softint.c:592
592                     (*sh->sh_func)(sh->sh_arg);
(gdb) up
#14 softint_dispatch (pinned=<optimized out>, s=4)
    at /usr/src/sys/kern/kern_softint.c:881
881             softint_execute(si, l, s);
(gdb)
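
	Frame #5 stops at a kmem_intr_alloc() call made with KM_NOSLEEP,
	and the trap is a write to address 0.  A KM_NOSLEEP allocation may
	return NULL, so one possible reading (not established by this
	report) is that the allocation failed and its result was stored
	through without a check.  The userland sketch below only
	illustrates that pattern and a checked variant; gc_entry,
	alloc_nosleep and stage_gc_checked are hypothetical stand-ins,
	not the code in subr_thmap.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a deferred-free (GC) list entry. */
struct gc_entry {
	uintptr_t addr;
	size_t len;
	struct gc_entry *next;
};

/* Test knob: make the hypothetical non-sleeping allocator fail. */
static bool alloc_should_fail;

/* Hypothetical non-sleeping allocator: may return NULL instead of waiting. */
static void *
alloc_nosleep(size_t size)
{
	return alloc_should_fail ? NULL : malloc(size);
}

/*
 * Checked staging: report failure to the caller rather than storing
 * through a NULL pointer (which would fault at address 0).
 */
static bool
stage_gc_checked(struct gc_entry **head, uintptr_t addr, size_t len)
{
	struct gc_entry *gc = alloc_nosleep(sizeof(*gc));

	if (gc == NULL)
		return false;
	gc->addr = addr;
	gc->len = len;
	gc->next = *head;
	*head = gc;
	return true;
}
```

	With alloc_should_fail set, stage_gc_checked() returns false and
	leaves the list untouched; an unchecked variant would instead
	perform the first store at address 0, which would match "cr2 0"
	with "code 0x2".  How the kernel should handle such a failure
	(for example by preallocating the entry) is a separate question
	this sketch does not answer.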


>How-To-Repeat:
	Unknown; this may be network-triggered.
>Fix:
	Don't know, sorry.


