NetBSD-Bugs archive


kern/50168: Frequent lockups and panics with NetBSD 7/amd64, may be ipfilter-related



>Number:         50168
>Category:       kern
>Synopsis:       Frequent lockups and panics with NetBSD 7/amd64, may be ipfilter-related
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Aug 25 12:00:01 +0000 2015
>Originator:     Stephen Borrill
>Release:        7.0_RC3
>Organization:
Precedence Technologies Ltd
>Environment:
NetBSD netmanager 7.0_RC3 NetBSD 7.0_RC3 (NETMANRAID) #19: Tue Aug 18 08:41:57 BST 2015  root@builder7:/usr/work/netmanager/work/obj/7.0/sys/arch/amd64/compile/NETMANRAID amd64
Lockups also seen with:
NetBSD netmanager 7.0_RC2 NetBSD 7.0_RC2 (NETMANRAID) #16: Thu Aug 6 16:08:43 BST 2015  root@builder7:/usr/work/netmanager/work/obj/7.0/sys/arch/amd64/compile/NETMANRAID amd64
>Description:
After upgrading from NetBSD 5/i386 to 7/amd64, I'm seeing random lockups and panics (I do not think the upgrade process itself is at fault). I do not see this in a Xen domU, only on real hardware (machines containing both wm and bge NICs), and I've seen it on more than one machine. The freezes (which require a power-off) or panics happen during or soon after boot: sometimes while the rc.d scripts are still running, sometimes a couple of minutes afterwards.

I realised the machines were rock-solid in single-user mode and eventually tracked the problem down to the ipfilter rules being loaded. With ipfilter=NO, or with a simple ipf.conf containing only "pass in all" and "pass out all", the machine runs fine. It is also reliable with NAT enabled, but loading the previous ipf.conf (listed further down) causes a freeze within a minute or so.
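
For reference, the known-good minimal ruleset mentioned above would look roughly like this. This is a sketch that assumes these two rules are the entire contents of /etc/ipf.conf, with nothing else loaded alongside them:

```
# Minimal ipf.conf sketch: pass everything in both directions.
# Assumption: no other rules are present in the file.
pass in all
pass out all
```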

I managed to get some kernel dumps from a couple of panics:

(gdb) bt
#0  0xffffffff806497b5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at /usr/src/7.0/sys/arch/amd64/amd64/machdep.c:671
#1  0xffffffff80872f22 in vpanic (fmt=fmt@entry=0xffffffff80d27d60 "m_copydata(%p,%d,%d,%p): m=NULL, off=%d (%d), len=%d (%d)", ap=ap@entry=0xfffffe804618eb28) at /usr/src/7.0/sys/kern/subr_prf.c:340
#2  0xffffffff80872fdd in panic (fmt=fmt@entry=0xffffffff80d27d60 "m_copydata(%p,%d,%d,%p): m=NULL, off=%d (%d), len=%d (%d)") at /usr/src/7.0/sys/kern/subr_prf.c:256
#3  0xffffffff8090d53b in m_copydata (m=0x0, off=<optimized out>, len=<optimized out>, vp=0xfffffe80bb07a89c) at /usr/src/7.0/sys/kern/uipc_mbuf.c:897
#4  0xffffffff808a273c in tcp_build_datapkt (mp=<synthetic pointer>, hdrlen=52, len=294, off=<optimized out>, so=0xfffffe80a36e0db0, tp=0xfffffe809de85458) at /usr/src/7.0/sys/netinet/tcp_output.c:520
#5  tcp_output (tp=0xfffffe809de85458) at /usr/src/7.0/sys/netinet/tcp_output.c:1253
#6  0xffffffff808aa67c in tcp_send (nam=<optimized out>, l=<optimized out>, control=0x0, m=0xfffffe80bb07a800, so=0xfffffe80a36e0db0) at /usr/src/7.0/sys/netinet/tcp_usrreq.c:1186
#7  tcp_send_wrapper (a=0xfffffe80a36e0db0, b=0xfffffe80bb07a800, c=<optimized out>, d=0x0, e=<optimized out>) at /usr/src/7.0/sys/netinet/tcp_usrreq.c:2499
#8  0xffffffff809130ce in sosend (so=0xfffffe80a36e0db0, addr=0x0, uio=0xfffffe804618ee18, top=0xfffffe80bb07a800, control=0x0, flags=<optimized out>, l=0xfffffe80ba9890a0)
    at /usr/src/7.0/sys/kern/uipc_socket.c:1054
#9  0xffffffff8088db28 in soo_write (fp=<optimized out>, offset=<optimized out>, uio=<optimized out>, cred=<optimized out>, flags=<optimized out>) at /usr/src/7.0/sys/kern/sys_socket.c:118
#10 0xffffffff808834ae in dofilewrite (fd=fd@entry=34, fp=0xfffffe80ae918cc0, buf=0x1b73000, nbyte=294, offset=<optimized out>, flags=flags@entry=1, retval=retval@entry=0xfffffe804618eeb8)
    at /usr/src/7.0/sys/kern/sys_generic.c:355
#11 0xffffffff808835a9 in sys_write (l=<optimized out>, uap=0xfffffe804618ef00, retval=0xfffffe804618eeb8) at /usr/src/7.0/sys/kern/sys_generic.c:323
#12 0xffffffff8088df9a in sy_call (rval=0xfffffe804618eeb8, uap=0xfffffe804618ef00, l=0xfffffe80ba9890a0, sy=0xffffffff8100d7a0 <sysent+64>) at /usr/src/7.0/sys/sys/syscallvar.h:61
#13 sy_invoke (code=4, rval=0xfffffe804618eeb8, uap=0xfffffe804618ef00, l=0xfffffe80ba9890a0, sy=0xffffffff8100d7a0 <sysent+64>) at /usr/src/7.0/sys/sys/syscallvar.h:85
#14 syscall (frame=0xfffffe804618ef00) at /usr/src/7.0/sys/arch/x86/x86/syscall.c:156
#15 0xffffffff80100691 in Xsyscall ()

#0  0xffffffff806497b5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at /usr/src/7.0/sys/arch/amd64/amd64/machdep.c:671
#1  0xffffffff80872f22 in vpanic (fmt=fmt@entry=0xffffffff80d280e9 "sbappendaddr", ap=ap@entry=0xfffffe8045711d40) at /usr/src/7.0/sys/kern/subr_prf.c:340
#2  0xffffffff80872fdd in panic (fmt=fmt@entry=0xffffffff80d280e9 "sbappendaddr") at /usr/src/7.0/sys/kern/subr_prf.c:256
#3  0xffffffff8091543c in sbappendaddr (sb=sb@entry=0xfffffe80b3d93148, asa=asa@entry=0xfffffe8045711e50, m0=m0@entry=0xfffffe8097e8d600, control=0xfffffe8097e8d600)
    at /usr/src/7.0/sys/kern/uipc_socket2.c:957
#4  0xffffffff808ec230 in udp4_sendup (m=m@entry=0xfffffe80be56c000, off=off@entry=28, src=src@entry=0xfffffe8045711e50, so=0xfffffe80b3d93000) at /usr/src/7.0/sys/netinet/udp_usrreq.c:498
#5  0xffffffff808ecd1f in udp4_realinput (off=28, mp=<synthetic pointer>, dst=0xfffffe8045711e60, src=0xfffffe8045711e50) at /usr/src/7.0/sys/netinet/udp_usrreq.c:639
#6  udp_input (m=0xfffffe80be56c000) at /usr/src/7.0/sys/netinet/udp_usrreq.c:387
#7  0xffffffff80556f36 in ip_input (m=0xfffffe80be56c000) at /usr/src/7.0/sys/netinet/ip_input.c:772
#8  ipintr (arg=<optimized out>) at /usr/src/7.0/sys/netinet/ip_input.c:353
#9  0xffffffff805eb86a in softint_execute (l=<optimized out>, s=<optimized out>, si=<optimized out>) at /usr/src/7.0/sys/kern/kern_softint.c:589
#10 softint_dispatch (pinned=<optimized out>, s=4) at /usr/src/7.0/sys/kern/kern_softint.c:871
#11 0xffffffff8011402f in Xsoftintr ()

/etc/ipf.conf contains:
count in on bge0 from any to any
count out on bge0 from any to any
count out on bge0 proto tcp from any to any port = 80
count out on bge0 proto tcp from any to any port = 3128
count in on wm0 from any to any
count out on wm0 from any to any
count out on wm0 proto tcp from any to any port = 80
count out on wm0 proto tcp from any to any port = 3128
pass out quick on lo0
pass in quick on lo0 from 127.0/8 to any
block in quick proto icmp all icmp-type 13
block out quick proto icmp all icmp-type 14
block in quick proto tcp all flags SF
pass in quick proto icmp all
pass out quick proto icmp all
pass in quick on bge0 from 192.168.1.0/24 to any
pass out quick on bge0 from any to 192.168.1.0/24
block in on wm0 all
block out on wm0 all
pass out on wm0 proto tcp all keep state
pass out on wm0 proto udp all keep state
pass out on wm0 proto gre all keep state
pass in proto udp from any to any port = ntp
pass in proto tcp from any to 192.168.1.84 port = smtp flags S keep state
pass in proto tcp from any to 80.aa.bb.cc port = smtp flags S keep state


>How-To-Repeat:
I have not yet been able to set up a testbed that reproduces the problem reliably, as I did not have the luxury of time when first investigating. I will add to this PR when I have more information.
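
One possible way to narrow down which rule triggers the problem is to load growing prefixes of the ipf.conf above, one at a time, from the console. The following is only a sketch under stated assumptions: rules_prefix is a hypothetical helper name (not part of ipfilter), /tmp/ipf.test is an arbitrary scratch path, and a prefix of the ruleset is a different filtering policy from the full file, so this only probes for the crash and is not a usable configuration.

```sh
#!/bin/sh
# Sketch: print the first N rules of an ipf.conf-style file, skipping
# blank lines and comment lines, so that growing prefixes of the
# ruleset can be loaded one at a time.
# rules_prefix is a hypothetical helper, not part of ipfilter.
rules_prefix() {
    # $1 = rules file, $2 = number of rules to keep
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | head -n "$2"
}

# Example use, left commented out because it replaces the live ruleset:
#   rules_prefix /etc/ipf.conf 8 > /tmp/ipf.test
#   ipf -Fa -f /tmp/ipf.test    # flush all rules, then load the prefix
```

Increasing the count step by step (or halving it, bisection-style) until the freeze reappears would point at the rule, or group of rules, that matters.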
>Fix:


