Subject: kern/21954: panic in in_pcbconnect
To: None <gnats-bugs@gnats.netbsd.org>
From: Frank Kardel <kardel@acm.org>
List: netbsd-bugs
Date: 06/22/2003 09:31:34
>Number:         21954
>Category:       kern
>Synopsis:       panic in in_pcbconnect() (-current around 20030620)
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jun 22 07:32:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Frank Kardel
>Release:        NetBSD 1.6U -current 20030620
>Organization:
	
>Environment:
	
	
System: NetBSD pip 1.6U NetBSD 1.6U (PIP) #4: Fri Jun 20 08:01:09 MEST 2003 root@pip:/fs/IC35L120AVV207-0-e/src/NetBSD/netbsd/sys/arch/i386/compile/obj.i386/PIP i386
Architecture: i386
Machine: i386
>Description:
For about 4 days I have been experiencing panics in in_pcbconnect() fairly regularly.
dmesg output:
<6>Connection attempt to TCP [::0001]:113 from [::0001]:50615
uvm_fault(0xe9231794, 0, 0, 1) -> 0xe
fatal page fault in supervisor mode
trap type 6 code 0 eip c0112658 cs 8 eflags 10246 cr2 60 ilevel 5
panic: trap
Begin traceback...
trap() at netbsd:trap+0x28e
--- trap (number 6) ---
in_pcbconnect(c14d3b64,c2130800,2400,e4e7ee4c,0) at netbsd:in_pcbconnect+0x1c0
udp_usrreq(c16b3d94,9,c14df900,c2130800,0) at netbsd:udp_usrreq+0x2d1
sosend(c16b3d94,c2130800,e4e7ee80,c14df900,0) at netbsd:sosend+0x4e8
sendit(e94b1874,9,e4e7ef24,0,e4e7ef78) at netbsd:sendit+0x169
sys_sendmsg(e4e33584,e4e7ef80,e4e7ef78,c0309c38,0) at netbsd:sys_sendmsg+0xb3
syscall_plain(2b,2b,2b,2b,fffffffc) at netbsd:syscall_plain+0xab
End traceback...
syncing disks... 7 3 3 3 3 panic: setrunqueue
Begin traceback...
setrunqueue(e4e33c84,c176cce4,c16e6800,c176d068,4) at netbsd:setrunqueue+0x2a
setrunnable(e4e33c84,5ac2d8d3,e4199c60,c16e68dc,e4dd3340) at netbsd:setrunnable+0xc1
itimerfire(e4dd3340,0,0,c176cce4,e4dd3340) at netbsd:itimerfire+0xc0
realtimerexpire(e4dd3340,c0471540,c024e72a,e4e7ec04,c16e89e0) at netbsd:realtimerexpire+0x15
softclock(0,e4e7ebf8,c02f760a,e4e7ec04,0) at netbsd:softclock+0x1ee
softintr_dispatch(0,cab20010,e94b0030,e4e70010,c0290010) at netbsd:softintr_dispatch+0xb1
Xsoftclock() at netbsd:Xsoftclock+0x25
--- interrupt ---
vfs_shutdown(e4e7eccc,1,fff9,c03f5169,c0271efc) at netbsd:vfs_shutdown+0xe6
cpu_reboot(100,0,e4e7ed10,c030993e,0) at netbsd:cpu_reboot+0x3b
panic(c041591e,c04158b3,e4e7ed18,1,5) at netbsd:panic+0x12f
trap() at netbsd:trap+0x28e
--- trap (number 6) ---
in_pcbconnect(c14d3b64,c2130800,2400,e4e7ee4c,0) at netbsd:in_pcbconnect+0x1c0
udp_usrreq(c16b3d94,9,c14df900,c2130800,0) at netbsd:udp_usrreq+0x2d1
sosend(c16b3d94,c2130800,e4e7ee80,c14df900,0) at netbsd:sosend+0x4e8
sendit(e94b1874,9,e4e7ef24,0,e4e7ef78) at netbsd:sendit+0x169
sys_sendmsg(e4e33584,e4e7ef80,e4e7ef78,c0309c38,0) at netbsd:sys_sendmsg+0xb3
syscall_plain(2b,2b,2b,2b,fffffffc) at netbsd:syscall_plain+0xab
End traceback...

dumping to dev 0,1 offset 391
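
A note on reading the trap line in the dmesg output: on i386, trap type 6 is the page fault trap, and cr2 holds the faulting address. Assuming "code 0" is the hardware page-fault error code and "cr2 60" is printed in hex, the fault was a supervisor-mode read of a not-present page at address 0x60, which would be consistent with a structure member being read through a NULL pointer. The sketch below only decodes the three low error-code bits; the macro and function names are made up for illustration and are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Low bits of the x86 page-fault error code (hypothetical local names). */
#define PF_ERR_P 0x01u	/* set: protection violation; clear: page not present */
#define PF_ERR_W 0x02u	/* set: write access; clear: read access */
#define PF_ERR_U 0x04u	/* set: fault taken in user mode; clear: supervisor */

struct pf_info {
	bool not_present;
	bool write;
	bool user;
};

/* Split an error code into its three low flag bits. */
static struct pf_info
decode_pf(uint32_t code)
{
	struct pf_info pf;

	pf.not_present = (code & PF_ERR_P) == 0;
	pf.write = (code & PF_ERR_W) != 0;
	pf.user = (code & PF_ERR_U) != 0;
	return pf;
}

/* Format a one-line description; returns what snprintf() returns. */
static int
describe_pf(uint32_t code, uint32_t cr2, char *buf, size_t len)
{
	struct pf_info pf = decode_pf(code);

	assert(buf != NULL);
	return snprintf(buf, len, "%s-mode %s, %s, address 0x%x",
	    pf.user ? "user" : "supervisor",
	    pf.write ? "write" : "read",
	    pf.not_present ? "page not present" : "protection violation",
	    (unsigned int)cr2);
}
```

Calling describe_pf(0, 0x60, buf, sizeof(buf)) with the values from the trap line describes a supervisor-mode read of a not-present page at 0x60.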

Information from debugger:
(gdb) up
#13 0xc010adc4 in calltrap ()
(gdb) up
#14 0xc0127071 in udp_usrreq (so=0xc16b3d94, req=9, m=0xc14df900, 
    nam=0xc2130800, control=0x0, p=0xe94b1874)
    at /fs/IC35L120AVV207-0-e/src/NetBSD/netbsd/sys/netinet/udp_usrreq.c:1042
1042                            error = in_pcbconnect(inp, nam);
(gdb) list
1037                            laddr = inp->inp_laddr;         /* XXX */
1038                            if ((so->so_state & SS_ISCONNECTED) != 0) {
1039                                    error = EISCONN;
1040                                    goto die;
1041                            }
1042                            error = in_pcbconnect(inp, nam);
1043                            if (error)
1044                                    goto die;
1045                    } else {
1046                            if ((so->so_state & SS_ISCONNECTED) == 0) {
(gdb) print inp
$2 = (struct inpcb *) 0x5
(gdb) print nam
$3 = (struct mbuf *) 0xc2130800
(gdb) print *nam
$4 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, 
    mh_data = 0xc2130820 "\020\002", mh_owner = 0x0, mh_len = 16, 
    mh_flags = 0, mh_paddr = 703522816, mh_type = 3}, M_dat = {MH = {
      MH_pkthdr = {rcvif = 0x35000210, tags = {slh_first = 0xcc40ded4}, 
        len = 0, csum_flags = 0, csum_data = 0}, MH_dat = {MH_ext = {
          ext_buf = 0x0, ext_free = 0, ext_arg = 0x0, ext_size = 0, 
          ext_type = 0x0, ext_nextref = 0x0, ext_prevref = 0x0, ext_un = {
            extun_paddr = 0, extun_pgs = {0x0 <repeats 14 times>, 0xffdf0000, 
              0x3030074, can not access 0x40, invalid translation (invalid PTE)
can not access 0x40, invalid translation (invalid PTE)
can not access 0x40, invalid translation (invalid PTE)
can not access 0x40, invalid translation (invalid PTE)
can not access 0x40, invalid translation (invalid PTE)
can not access 0x40, invalid translation (invalid PTE)
0x0}}, ext_ofile = 0x40 <Address 0x40 out of bounds>, 
can not access 0x7, invalid translation (invalid PTE)
can not access 0x7, invalid translation (invalid PTE)
can not access 0x7, invalid translation (invalid PTE)
can not access 0x7, invalid translation (invalid PTE)
can not access 0x7, invalid translation (invalid PTE)
can not access 0x7, invalid translation (invalid PTE)
          ext_nfile = 0x7 <Address 0x7 out of bounds>, ext_oline = 206, 
          ext_nline = 1043346236}, 
        MH_databuf = '\000' <repeats 86 times>, "t\000\003\003\000\000\000\000@\000\000\000\a\000\000\000\000\000\000<30>Jun 22 08:20:29 named[12696]: zone 0.8.e.f.ip6.int/IN: sending notifies (serial 2003052801)"}}, 
    M_databuf = "\020\002\0005@", '\000' <repeats 98 times>, "t\000\003\003\000\000\000\000@\000\000\000\a\000\000\000\000\000\000<30>Jun 22 08:20:29 named[12696]: zone 0.8.e.f.ip6.int/IN: sending notifies (serial 2003052801)"}}

inp = 0x5 does not look like a valid pointer...
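
For reference, nam itself looks sane: mh_len is 16 and the data starts with "\020\002", i.e. a 16-byte AF_INET sockaddr. Since gdb prints the packet header fields overlaid on the same bytes, the first eight bytes can be recovered from rcvif = 0x35000210 and tags.slh_first = 0xcc40ded4. The sketch below decodes them, assuming little-endian i386, assuming those two fields sit at the very start of the data area that mh_data points to, and assuming the BSD sockaddr_in layout (sin_len, sin_family, sin_port, sin_addr). struct sin_view and the helper names are hypothetical, not kernel types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the leading sockaddr_in fields (not the kernel struct). */
struct sin_view {
	uint8_t  len;		/* sin_len */
	uint8_t  family;	/* sin_family */
	uint16_t port;		/* sin_port, converted to host order */
	uint8_t  addr[4];	/* sin_addr, in network byte order */
};

/* Store a 32-bit word the way a little-endian CPU lays it out in memory. */
static void
le32_store(uint32_t w, uint8_t *p)
{
	p[0] = (uint8_t)(w & 0xff);
	p[1] = (uint8_t)((w >> 8) & 0xff);
	p[2] = (uint8_t)((w >> 16) & 0xff);
	p[3] = (uint8_t)((w >> 24) & 0xff);
}

/* Decode the first 8 bytes of a sockaddr_in; returns 0, or -1 if too short. */
static int
decode_sin(const uint8_t *buf, size_t buflen, struct sin_view *out)
{
	assert(buf != NULL && out != NULL);
	if (buflen < 8)
		return -1;
	out->len = buf[0];
	out->family = buf[1];
	out->port = (uint16_t)((buf[2] << 8) | buf[3]);	/* network order */
	out->addr[0] = buf[4];
	out->addr[1] = buf[5];
	out->addr[2] = buf[6];
	out->addr[3] = buf[7];
	return 0;
}

/* Decode the two words gdb printed for rcvif and tags.slh_first. */
static int
decode_from_words(uint32_t w0, uint32_t w1, struct sin_view *out)
{
	uint8_t bytes[8];

	le32_store(w0, bytes);
	le32_store(w1, bytes + 4);
	return decode_sin(bytes, sizeof(bytes), out);
}
```

With decode_from_words(0x35000210, 0xcc40ded4, &v) this gives length 16, family 2 (AF_INET), port 53 and address 212.222.64.204, which would fit the named notify traffic visible in the buffer. The address relies on the overlay assumption above; the M_databuf string as archived shows only "@" after the port bytes.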

I am running pppoe, and the provider regularly forces a disconnect to ensure I get my share of new 8-) IP addresses...


>How-To-Repeat:
	Run a recent -current, probably with pppoe and a provider that does forced disconnects...

>Fix:
	No idea for a workaround.
	The last significant change to udp_usrreq.c was revision 1.100:
Change the way multicasts are kept.  They now use a hash table in the same
manner as the ifaddr hash table.  By doing this, the mkludge code can go
away.  At the same time, keep track of what pcbs are using what ifaddr and
when an address is deleted from an interface, notify/abort all sockets
that have that address as a source.  Switch IGMP and multicasts to use pools
for allocation.  Fix a number of potential problems in the igmp code where
allocation failures could cause a trap/panic.

	Maybe something is lurking in there...
>Release-Note:
>Audit-Trail:
>Unformatted: