Subject: kern/21189: ifconfig + stf0 make kernel crash with uvm fault
List: netbsd-bugs
Date: 04/15/2003 07:12:14
>Number:         21189
>Category:       kern
>Synopsis:       ifconfig + stf0 make kernel crash with uvm fault
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Apr 15 07:13:00 UTC 2003
>Originator:     Alexander Grigo
>Release:        NetBSD 1.6Q (current)
NetBSD yoali 1.6Q NetBSD 1.6Q (YOALI.gdb) #0: Mon Apr 14 17:48:26 CEST 2003  root@n406.home:/userland/NetBSD-checkout/obj/userland/NetBSD-checkout/src/sys/arch/i386/compile/YOALI.gdb i386

When playing with stf0 and ifconfig I got uvm faults at least
in 1.6O and 1.6Q of current (i386). Userland is current as of
20030213 (not really sure) and I used kernels from the same
date (1.6O) and the one from last week's checkout (1.6Q).

My dmesg output is

NetBSD 1.6Q (YOALI.gdb) #0: Mon Apr 14 17:48:26 CEST 2003
total memory = 36480 KB
avail memory = 31228 KB
using 481 buffers containing 1924 KB of memory
mainbus0 (root)
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel 486DX2 (486-class), id 0x435
cpu0: features 3<FPU,VME>
isa0 at mainbus0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: kgdb
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 channel 0 drive 0: <Maxtor 71260 AT>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 1204 MB, 2448 cyl, 16 head, 63 sec, 512 bytes/sect x 2467614 sectors
wd0: drive supports PIO mode 3, DMA mode 0
vga0 at isa0 port 0x3b0-0x3df iomem 0xa0000-0xbffff
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation), using wskbd0
wsmux1: connecting to wsdisplay0
lpt0 at isa0 port 0x378-0x37b irq 7
pcppi0 at isa0 port 0x61
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: read port 0x203
ep0 at isapnp0 port 0x210/16 irq 5
ep0: 3Com 3C509B EtherLink III 
ep0: address 00:a0:24:7a:63:75, 8KB byte-wide FIFO, 5:3 Rx:Tx split
ep0: 10baseT, 10base5, 10base2 (default 10baseT)
IPsec: Initialized Security Association Processing.
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)

I compiled the kernel with debug and kgdb and used remote
debugging to get this (cont is to get the kernel running
after establishing the connection).
The print commands show the inaccessible memory address
at the position that caused the SIGSEGV.

(gdb) cont
Program received signal SIGSEGV, Segmentation fault.
in6ifa_ifpforlinklocal (ifp=0xc04ccc00, ignoreflags=7) at /userland/NetBSD-checkout/src/sys/netinet6/in6.c:1996

(gdb) p ifa
$1 = (struct ifaddr *) 0x5f666c65
(gdb) p ifa->ifa_addr
Cannot access memory at address 0x5f666c65
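
A side note on the value gdb printed for ifa: 0x5f666c65 does not
look like a kernel address (the other pointers in this report all
start with 0xc0 or 0xc4). Read as four little-endian bytes, the way
i386 stores it, the value is printable ASCII. The small sh sketch
below decodes it. decode_le32 is a helper name made up for this note,
not something from the kernel or gdb, and reading the result as "the
address list walked into memory that holds string data" is only a
guess, not a diagnosis.

```shell
#!/bin/sh
# Decode a 32-bit value as four little-endian bytes (i386 byte order).
# Printable ASCII bytes are shown as characters, everything else as '.'.
# decode_le32 is a hypothetical helper written for this note.
decode_le32() {
    val=$(($1))
    out=""
    i=0
    while [ "$i" -lt 4 ]; do
        # Extract byte number i, lowest byte first.
        byte=$(( (val >> (8 * i)) & 0xff ))
        if [ "$byte" -ge 32 ] && [ "$byte" -le 126 ]; then
            oct=$(printf '%03o' "$byte")
            out="$out$(printf '%b' "\\0$oct")"
        else
            out="$out."
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

decode_le32 0x5f666c65
```

For the value above this prints elf_ (bytes 0x65, 0x6c, 0x66, 0x5f).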

(gdb) where
#0  in6ifa_ifpforlinklocal (ifp=0xc04ccc00, ignoreflags=7) at /userland/NetBSD-checkout/src/sys/netinet6/in6.c:1996
#1  0xc015ab51 in nd6_prefix_onlink (pr=0xc046f880) at /userland/NetBSD-checkout/src/sys/netinet6/nd6_rtr.c:1611
#2  0xc015ae14 in nd6_prefix_offlink (pr=0xc04b5b00) at /userland/NetBSD-checkout/src/sys/netinet6/nd6_rtr.c:1741
#3  0xc015a37d in prelist_remove (pr=0xc04b5b00) at /userland/NetBSD-checkout/src/sys/netinet6/nd6_rtr.c:1096
#4  0xc01553ec in nd6_purge (ifp=0xc04cc000) at /userland/NetBSD-checkout/src/sys/netinet6/nd6.c:606
#5  0xc014b225 in in6_ifdetach (ifp=0xc04cc000) at /userland/NetBSD-checkout/src/sys/netinet6/in6_ifattach.c:672
#6  0xc01490e6 in in6_purgeif (ifp=0xc04cc000) at /userland/NetBSD-checkout/src/sys/netinet6/in6.c:1393
#7  0xc015d447 in udp6_usrreq (so=0xc4e25cb4, req=22, m=0x0, addr6=0x0, control=0xc04cc000, p=0xc4d85824) at /userland/NetBSD-checkout/src/sys/netinet6/udp6_usrreq.c:296
#8  0xc02599ec in if_detach (ifp=0xc04cc000) at /userland/NetBSD-checkout/src/sys/net/if.c:609
#9  0xc0266041 in stf_clone_destroy (ifp=0xc04cc000) at /userland/NetBSD-checkout/src/sys/net/if_stf.c:240
#10 0xc0259d11 in if_clone_destroy (name=0xc4e25ec0 "stf0") at /userland/NetBSD-checkout/src/sys/net/if.c:789
#11 0xc025a601 in ifioctl (so=0xc04af264, cmd=2149607801, data=0xc4e25ec0 "stf0", p=0xc4d85824) at /userland/NetBSD-checkout/src/sys/net/if.c:1339
#12 0xc023136c in soo_ioctl (fp=0xc4d19508, cmd=2149607801, data=0xc4e25ec0, p=0xc4d85824) at /userland/NetBSD-checkout/src/sys/kern/sys_socket.c:141
#13 0xc022e8ac in sys_ioctl (l=0xc4cfe700, v=0xc4e25f80, retval=0xc4e25f78) at /userland/NetBSD-checkout/src/sys/kern/sys_generic.c:640
#14 0xc029a4ff in syscall_plain (frame={tf_gs = 31, tf_fs = 31, tf_es = 31, tf_ds = 31, tf_edi = 134539662, tf_esi = -1077937840, tf_ebp = -1077938088, tf_ebx = 134538472, tf_edx = 0, tf_ecx = 134551920, tf_eax = 54, tf_trapno = 3, tf_err = 2, tf_eip = 1208972479, tf_cs = 23, tf_eflags = 647, tf_esp = -1077938132, tf_ss = 31, tf_vm86_es = 0, tf_vm86_ds = 0, tf_vm86_fs = 0, tf_vm86_gs = 0}) at /userland/NetBSD-checkout/src/sys/arch/i386/i386/syscall.c:156
#15 0xc0100a57 in syscall1 ()
#16 0x8049bd3 in ?? ()
#17 0x8049334 in ?? ()
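
One detail in the backtrace that is easy to miss: the ifp in frame #0
is not the ifp being destroyed in the frames below it. The sh sketch
here pulls the distinct ifp= values out of a backtrace to make that
visible. ifp_values and frames are names made up for this note, and
the frame lines are abbreviated copies of the ones above (return
addresses and source paths left out). That the other pointer belongs
to an stf0 from an earlier loop iteration is an assumption, not
something the trace proves.

```shell
#!/bin/sh
# Print the distinct ifp= pointer values found in a gdb backtrace on stdin.
# ifp_values and frames are hypothetical helper names.
ifp_values() {
    sed -n 's/.*(ifp=\(0x[0-9a-f]*\).*/\1/p' | sort -u
}

# The frames that take ifp as first argument, abbreviated from the
# backtrace above (addresses and source paths omitted).
frames() {
    cat <<'EOF'
#0  in6ifa_ifpforlinklocal (ifp=0xc04ccc00, ignoreflags=7)
#4  nd6_purge (ifp=0xc04cc000)
#5  in6_ifdetach (ifp=0xc04cc000)
#6  in6_purgeif (ifp=0xc04cc000)
#8  if_detach (ifp=0xc04cc000)
#9  stf_clone_destroy (ifp=0xc04cc000)
EOF
}

frames | ifp_values
```

This lists two values, 0xc04cc000 (the interface being destroyed) and
0xc04ccc00 (the one the crashing function was called with). The pr
arguments in frames #1 and #2 differ in the same way.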

Well, at first I was quite sure about how to repeat this
(see my posting from Apr.30 to current-users), but after
some further investigation I found out that one needs to
have some more things playing together.

Firstly, I had to set up my NIC (ep0) properly, i.e.
assign an IP address and set routes. I also had to send
some packets over this interface (I used ping);
sometimes this did not seem to be necessary, but I'm not
quite sure about that...

Secondly, I ran this little script

while [ "" = "" ]
do
	echo hallo $n

	echo create...
	ifconfig stf0 create
	echo set addr...
	ifconfig stf0 inet6 2002:1234:1234::1
	echo destroy...
	ifconfig stf0 destroy

	sleep 1
	n=$(($n + 1))
done

Normally this causes a uvm fault after the
second iteration of the loop.
It seems to be necessary to assign an IP
address to stf0; without it I got no crash.
If the script runs several times without
crashing the kernel, just kill it and generate some
network traffic (I used ping). Then start the script
again and see what happens.
In my experience the crash came, at the latest,
on the second run of the script.
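
The whole procedure (loop, then traffic, then loop again) could be
wrapped up roughly like this. It is only a sketch: run, stf_cycle,
repro and the DRY_RUN switch are names made up here, the iteration
count and the ping target are placeholders, and by default it only
prints the commands instead of running them. To run it for real it
has to be started as root with DRY_RUN=0, on a machine that may
crash.

```shell
#!/bin/sh
# Hypothetical wrapper around the reproduction steps described above.
# With DRY_RUN=1 (the default) commands are printed, not executed.
: "${DRY_RUN:=1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = number of create/assign/destroy iterations
stf_cycle() {
    n=0
    while [ "$n" -lt "$1" ]; do
        run ifconfig stf0 create
        run ifconfig stf0 inet6 2002:1234:1234::1
        run ifconfig stf0 destroy
        run sleep 1
        n=$((n + 1))
    done
}

# $1 = iterations per pass, $2 = host to ping between the two passes
repro() {
    stf_cycle "$1"
    run ping -c 5 "$2"
    stf_cycle "$1"
}
```

For example, "repro 5 192.0.2.1" prints the commands for two passes
of five iterations with a ping in between; 192.0.2.1 is just a
documentation address and has to be replaced by a host reachable
over ep0.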

Although I'm not sure whether this is related to
some hardware problem, I should mention that
my NIC is detected twice (as ep0 and ep1)
when booting with GENERIC. And (of course ??) only
one of them works (ep1), while the other seems to hang
until a timeout when using ifconfig. Therefore I used my
own kernel config without most of the unused stuff;
this way my NIC is detected as ep0 and only as ep0.

For all this testing and debugging I used a fresh
install. Except for my custom kernel, everything else
uses the default settings (ok, I added a user
and ifconfig.ep0).
Since I'm not an experienced kernel hacker, I
can't do much to fix this uvm fault.
But if you need some more debugging output,
just ask (but be specific about what you
want to know ;) I could also send you the
kernel core file (!36MB).