Subject: kern/1321: ppp tried to use some dead beef
To: None <gnats-bugs@gnats.netbsd.org>
From: John Kohl <jtk@kolvir.arlington.ma.us>
List: netbsd-bugs
Date: 08/08/1995 00:20:23
>Number:         1321
>Category:       kern
>Synopsis:       ppp tried using a deallocated mbuf
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Aug  8 00:50:01 1995
>Last-Modified:
>Originator:     John Kohl
>Organization:
NetBSD Kernel Hackers `R` Us
>Release:        NetBSD-current as of August 4th or so
>Environment:
	
System: NetBSD kolvir 1.0A NetBSD 1.0A (KOLVIR) #623: Thu Aug 3 22:47:12 EDT 1995 jtk@pattern:/u1/NetBSD-current/src/sys/arch/i386/compile/KOLVIR i386

My kernel has an added line "imask[IPL_NET] |= imask[IPL_TTY]" in
intr_calculatemasks(), but that didn't seem to help here.

>Description:
Just after heavy I/O load (NFS over ethernet, plus local disk activity),
ppp tried to use an mbuf pointer that was "0xdeadbeef"--somebody had
apparently freed the mbuf from which ppp tried to fetch a pointer.

Here's the stack trace and register dump.  It is apparently inside
pppgetm() at the "len -= M_DATASIZE(m)" line.  It dies trying to fetch
the flags out of the mbuf m.  It was called from pppinput() at the point
where it calls pppgetm() and then splx(s) and returns (preparing for the
next packet, apparently?).
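
To illustrate the failure mode described above, here is a minimal
user-space sketch, not the kernel code. It assumes the allocator
overwrites the link field of a freed buffer with 0xdeadbeef (as the crash
suggests) and that a chain walker can compare each pointer against that
pattern before dereferencing it. `struct buf`, `buf_poison()` and
`chain_capacity()` are hypothetical names; the real routine works on
struct mbuf and M_DATASIZE().

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed poison pattern, taken from the value seen in %ebx. */
#define POISON ((uintptr_t)0xdeadbeefUL)

/* Hypothetical stand-in for an mbuf: a link, some flags, a capacity. */
struct buf {
    struct buf    *next;
    unsigned short flags;
    size_t         size;
};

/* Hypothetical "free": overwrite the link so stale users are detectable. */
static void buf_poison(struct buf *b)
{
    b->next = (struct buf *)POISON;
}

/*
 * Sum the capacity of a chain.  Returns -1 instead of dereferencing a
 * pointer that carries the poison pattern.
 */
static long chain_capacity(const struct buf *head)
{
    long total = 0;

    for (const struct buf *b = head; b != NULL; b = b->next) {
        if ((uintptr_t)b == POISON)
            return -1;          /* chain runs through freed memory */
        total += (long)b->size;
    }
    return total;
}
```

A check like this only turns the crash into a detectable error; it does
not address whatever freed the buffer while it was still referenced.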

db> tr
_pppstart(f81e9a90,f81e9a90,f87f5c00,0) at _pppstart+0x5de
_pppinput(7e,f873e700) at _pppinput+0x226
_compoll(0) at _compoll+0x1c8
_softclock(f86f6c00,f8764e00,0,f9a48db8,f81a92e3) at _softclock+0x51
_hardclock(f9a48dc4,f9a48dc0,f8100f6d,f9a48dc4,0) at _hardclock+0x1a8
_clockintr(f9a48dc4) at _clockintr+0xb
_Xintr0() at _Xintr0+0x5d
--- interrupt ---
_idle(f8cf6508,0,f87bcb38,0,f9a48e90) at _idle+0xd
bpendtsleep(f8cf6508,4,f8188ca5,0) at bpendtsleep
_lock_clear_recursive(f87c5720,f9a48efc,1,100000,1) at _lock_clear_recursive+0x9af
_lock_clear_recursive(f87c5740,f9a48efc,1,1,f82ca7a0) at _lock_clear_recursive+0x480
_vm_pager_get_pages(f87c5740,f9a48efc,1,1,f9a48f4c) at _vm_pager_get_pages+0x4a
_vm_pager_get(f87c5740,f82ca7a0,1) at _vm_pager_get+0x14
_vm_fault(f8765100,1006c000,3,0) at _vm_fault+0x222
_trap() at _trap+0x4c1
--- trap (number 6) ---
0x10044862:
db> 
db> show reg
es                0x10
ds                0x10
edi         0xf81eaa7c  _ppp_softc+0xfec
esi              0x5e2
ebp         0xf9a48d30  _end+0x1835ff0
ebx         0xdeadbeef
edx         0xf81e9a90  _ppp_softc
ecx               0x1a
eax              0x5dc
eip         0xf814ae82  _pppstart+0x5de
cs                 0x8
eflags         0x10282
esp         0xf9a48d20  _end+0x1835fe0
ss                0x10
_pppstart+0x5de:        testb   $0x1,0x12(%ebx)
db> 
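
The faulting instruction is consistent with a read of the mbuf flags
through a poisoned pointer. The sketch below assumes the 4.4BSD-style
mbuf header layout (two pointers, an int length, a data pointer, then a
short type and a short flags field) and 32-bit i386 pointers, modelled
here with fixed-width fields. `struct m_hdr32`, `FLAGS_OFFSET` and
`fault_address()` are hypothetical names used only for illustration.
Under those assumptions the flags field sits at offset 0x12, matching
0x12(%ebx), and the $0x1 mask would correspond to M_EXT if it has the
4.4BSD value 0x0001.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed i386 layout of the mbuf header; pointers modelled as 32 bits. */
struct m_hdr32 {
    uint32_t mh_next;     /* next buffer in chain */
    uint32_t mh_nextpkt;  /* next chain in queue */
    int32_t  mh_len;      /* amount of data in this buffer */
    uint32_t mh_data;     /* location of data */
    int16_t  mh_type;     /* type of data */
    int16_t  mh_flags;    /* flags; bit 0x1 assumed to be M_EXT */
};

/* Byte offset of the flags field under the assumed layout. */
enum { FLAGS_OFFSET = offsetof(struct m_hdr32, mh_flags) };

/* Address the testb instruction would touch for a given base pointer. */
static uint32_t fault_address(uint32_t base)
{
    return base + FLAGS_OFFSET;
}
```

With %ebx holding 0xdeadbeef, the access would land at
fault_address(0xdeadbeef), well outside any mapped kernel data.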

Unfortunately, I don't have the mbuf pointer sc->sc_m handy, but I
suspect this bug is a run-of-the-mill premature free: an mbuf is released
while something still holds a reference to it.  Maybe there's some
interlock missing between various flavors of networking and memory
allocation?

>How-To-Repeat:
Run ppp (receiving), ethernet traffic, disk I/O, and swapping all at the
same time.  You'll probably get a crash sooner or later.

>Fix:
	???

>Audit-Trail:
>Unformatted: