Subject: Callout table troubles, maybe? Panic, yes.
To: None <tech-kern@netbsd.org>
From: Donald Lee <donlee_ntk@icompute.com>
List: tech-kern
Date: 02/21/2001 22:49:36
This is re-posted to tech-kern at the suggestion of Bill Studenmund....

It appears to be port-macppc specific, but I'm no authority.
I'm stuck largely because I don't know much about NetBSD internals,
don't know how to use the kernel debugger, and the traceback (stack
crawl) is busted in netbsd-macppc, so when I get the crashes, I can't even tell
where I've been....

I am willing to help chase this if someone can help me get useful information.

Please mail to me directly, as I don't subscribe (yet ;-> ) to the
tech-kern list.

-dgl-
--------------
Dear list,

I'm a little stuck right now.

I've been trying to find a problem with the Cyclades 8Y (multi-port
serial) driver, and I thought I was chasing a problem with the usage of
the callout mechanism in the kernel.  What I've found, though, is that
the panics I am generating happen with or without the Cyclades card.

To get the system to panic, I have to stress the ethernet and/or networking.

My technique is to get two ftp streams going to another NetBSD machine
on my network.  Adding a SCSI copy of a large file seems to speed
the crash.

When I have the Cyclades card in the machine, it crashes more quickly.
With the Cyclades, it takes no more than 15 seconds of stress.  Without
the card, it takes a minute or two.

>Feb 19 20:45:21 charm savecore: no core dump
>Feb 19 20:58:40 charm /netbsd: mc0: receive FIFO overflow
>Feb 19 20:58:40 charm /netbsd: short packet len=2
>Feb 19 21:02:08 charm syslogd: restart
>Feb 19 21:02:09 charm /netbsd: panic: m_copydata
>Feb 19 21:02:09 charm /netbsd: syncing disks... 6 6 6 6 6 6 5 5 5 5 5 5 5 5 5 5 5 5 5 5 4 4 giving up
>Feb 19 21:02:09 charm /netbsd: dumpsys: TBD
>Feb 19 21:02:09 charm /netbsd: sd0: cache synchronization failed
>Feb 19 21:02:09 charm /netbsd: rebooting

I get two different crashes.  One is a "panic: receive 1", which is
in:

>/*      $NetBSD: uipc_socket.c,v 1.50 2000/03/30 09:27:14 augustss Exp $ */
> ...
>#ifdef DIAGNOSTIC
>                if (m == 0 && so->so_rcv.sb_cc)
>                        panic("receive 1");
>#endif
> ... 
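
For context, the check quoted above fires when the receive buffer's byte count (sb_cc) is non-zero while its mbuf chain is empty, i.e. when the two pieces of bookkeeping disagree. A minimal userland sketch of that invariant follows. The names mbuf_model, sockbuf_model, chain_bytes and would_panic_receive1 are made up for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for struct mbuf and struct sockbuf (hypothetical). */
struct mbuf_model {
    struct mbuf_model *m_next;  /* next mbuf in the chain */
    int m_len;                  /* bytes of data held in this mbuf */
};

struct sockbuf_model {
    struct mbuf_model *sb_mb;   /* head of the mbuf chain */
    long sb_cc;                 /* bytes the buffer claims to hold */
};

/* Sum the bytes actually present on the chain. */
long chain_bytes(const struct mbuf_model *m)
{
    long total = 0;

    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/*
 * Returns 1 when the condition tested before panic("receive 1") holds:
 * no mbuf at the head of the chain, yet a non-zero byte count.
 */
int would_panic_receive1(const struct sockbuf_model *sb)
{
    assert(sb != NULL);
    return sb->sb_mb == NULL && sb->sb_cc != 0;
}
```

In this model a buffer whose chain sums to sb_cc passes, while { NULL, 100 } trips the check. The sketch only shows what the test detects; it says nothing about how the count and the chain came to disagree.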

The other is the "panic: m_copydata", which I've not really chased down.
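
As far as I understand the BSD mbuf code, m_copydata copies a byte range out of an mbuf chain and panics when the chain ends before the requested offset and length have been consumed. The following is a userland sketch of that behavior under that assumption. mbuf_sketch and copydata_sketch are hypothetical names, and the function returns -1 at the points where I believe the kernel routine would panic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct mbuf (hypothetical). */
struct mbuf_sketch {
    struct mbuf_sketch *m_next; /* next mbuf in the chain */
    const char *m_data;         /* start of this mbuf's data */
    int m_len;                  /* bytes of data in this mbuf */
};

/*
 * Copy len bytes, starting off bytes into the chain, to cp.
 * Returns 0 on success and -1 where the chain is too short
 * (the case assumed to correspond to panic("m_copydata")).
 */
int copydata_sketch(const struct mbuf_sketch *m, int off, int len, char *cp)
{
    int count;

    assert(cp != NULL);
    if (off < 0 || len < 0)
        return -1;

    /* Skip whole mbufs until the starting offset falls inside one. */
    while (off > 0) {
        if (m == NULL)
            return -1;
        if (off < m->m_len)
            break;
        off -= m->m_len;
        m = m->m_next;
    }

    /* Copy piece by piece, moving to the next mbuf as each is drained. */
    while (len > 0) {
        if (m == NULL)
            return -1;
        count = m->m_len - off;
        if (count > len)
            count = len;
        memcpy(cp, m->m_data + off, (size_t)count);
        len -= count;
        cp += count;
        off = 0;
        m = m->m_next;
    }
    return 0;
}
```

With a two-mbuf chain holding "hello" and "world", asking for 4 bytes at offset 3 succeeds, while asking for 4 bytes at offset 8 runs off the end of the chain. If this reading is right, both panics point at a chain that is shorter than some length field says it should be.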

My machine:

>PMac 7600 w/ 604 CPU (rev 303). 132 MHz
>NetBSD 1.5, with mods to cyclades and in extintr.c to fix ppp problems.
>dev/ic/cy.c instrumented and modified to provide debug info.
>Otherwise, pretty stock.
>Using built-in ethernet (mc0)
>Booting from SCSI (4 GB Seagate, ID 0)

I believe that the problem is with the low-level interrupt handlers in
the kernel, though that's just a guess.

I cannot generate these crashes with the standard 1.5 kernel.  I believe that
the change I made to extintr.c, which ensures that soft interrupts get
handled promptly, is causing both the crashes and a speed-up in my ftp
transfers.  (With a standard kernel, the ftp streams in my "crash test"
run at only about 300-400 Kb/s.  With the modified kernel, they run
very close to 1 Mb/s.)

It is noteworthy that the Cyclades driver makes heavy use of the callout
mechanism in the kernel.  It fires a handler on every tick.
(callout_reset(&callent, 1, func, NULL);)
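
To illustrate the pattern: a callout is one-shot, so a handler that wants to run on every tick has to re-arm itself each time it fires. Below is a userland model of that, not kernel code. callout_model, callout_reset_model, tick_model, poll_handler and run_ticks are hypothetical names that only mimic the shape of the callout_reset() call above.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-in for struct callout; not the kernel definition. */
struct callout_model {
    int pending;                /* non-zero while armed */
    int expires;                /* tick at which the handler is due */
    void (*func)(void *);
    void *arg;
};

static int now_ticks;           /* model of the kernel tick counter */
static struct callout_model callent;
static int poll_count;          /* how many times the handler has run */

/* Arm (or re-arm) the callout to fire `ticks` ticks from now. */
void callout_reset_model(struct callout_model *c, int ticks,
    void (*func)(void *), void *arg)
{
    assert(func != NULL);
    c->pending = 1;
    c->expires = now_ticks + ticks;
    c->func = func;
    c->arg = arg;
}

/* One clock tick: advance time and run the callout if it is due. */
void tick_model(struct callout_model *c)
{
    now_ticks++;
    if (c->pending && c->expires <= now_ticks) {
        c->pending = 0;         /* firing disarms the callout */
        c->func(c->arg);        /* the handler may re-arm it */
    }
}

/* Handler in the style described above: do the work, then re-arm. */
void poll_handler(void *arg)
{
    (void)arg;
    poll_count++;
    callout_reset_model(&callent, 1, poll_handler, NULL);
}

/* Run the model for n ticks; returns how many times the handler fired. */
int run_ticks(int n)
{
    int i;

    now_ticks = 0;
    poll_count = 0;
    callout_reset_model(&callent, 1, poll_handler, NULL);
    for (i = 0; i < n; i++)
        tick_model(&callent);
    return poll_count;
}
```

In this model run_ticks(100) fires the handler 100 times and leaves the callout armed, so a driver using this pattern runs its handler once per clock tick for as long as it stays loaded.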

With a little guidance, I may even be able to track this down, but I don't
know enough about the innards of NetBSD's kernel to be very helpful.  All
I can do at this point is grope around in the dark.

Help?

Thanks in advance,

-dgl-

(logfile from culprit machine)

Feb 19 20:25:55 charm /netbsd: NetBSD 1.5 (try7) #145: Mon Feb 19 19:49:55 CST 2001
Feb 19 20:25:55 charm /netbsd:     donlee@charm:/usr/src/sys/arch/macppc/compile/try7
Feb 19 20:25:55 charm /netbsd: CPU: 604 (Revision 303)
Feb 19 20:25:55 charm /netbsd: total memory = 32768 KB
Feb 19 20:25:55 charm /netbsd: avail memory = 25640 KB
Feb 19 20:25:55 charm /netbsd: using 435 buffers containing 1740 KB of memory
Feb 19 20:25:56 charm /netbsd: mainbus0 (root)
Feb 19 20:25:56 charm /netbsd: cpu0 at mainbus0bandit0 at mainbus0
Feb 19 20:25:56 charm /netbsd: pci0 at bandit0 bus 0
Feb 19 20:25:56 charm /netbsd: pci0: i/o space, memory space enabled
Feb 19 20:25:56 charm /netbsd: pchb0 at pci0 dev 11 function 0
Feb 19 20:25:56 charm /netbsd: pchb0: Apple Computer Bandit Host-PCI Bridge (rev. 0x03)
Feb 19 20:25:56 charm /netbsd: cy: Found Cyclades PCI device, id = 0x105120e
Feb 19 20:25:56 charm /netbsd: cy0 at pci0 dev 15 function 0cy: card reset done
Feb 19 20:25:56 charm /netbsd: cy0 probe chip 0 offset 0x0 ... cy0 firmware version 0x48
Feb 19 20:25:56 charm /netbsd: cy0 probe chip 1 offset 0x800 ... cy0 firmware version 0x48
Feb 19 20:25:56 charm /netbsd: cy0 probe chip 2 offset 0x1000 ... not ready for command
Feb 19 20:25:56 charm /netbsd: cy0 found 2 CD1400s
Feb 19 20:25:56 charm /netbsd: : interrupting at irq 25
Feb 19 20:25:56 charm /netbsd: cy0attach CD1400 #0 offset 0x0
Feb 19 20:25:56 charm /netbsd: attach CD1400 #1 offset 0x800
Feb 19 20:25:56 charm /netbsd: : 8 ports
Feb 19 20:25:56 charm /netbsd: obio0 at pci0 dev 16 function 0: addr 0xf3000000
Feb 19 20:25:56 charm /netbsd: esp0 at obio0 offset 0x10000 irq 12: NCR53C94, 25MHz, SCSI ID 7
Feb 19 20:25:56 charm /netbsd: scsibus0 at esp0: 8 targets, 8 luns per target
Feb 19 20:25:56 charm /netbsd: mc0 at obio0 offset 0x11000: irq 14,2,3: address 00:05:02:d8:41:6c
Feb 19 20:25:56 charm /netbsd: zsc0 at obio0 offset 0x13000: irq 15,16
Feb 19 20:25:56 charm /netbsd: zstty0 at zsc0 channel 0
Feb 19 20:25:56 charm /netbsd: zstty1 at zsc0 channel 1
Feb 19 20:25:56 charm /netbsd: awacs at obio0 offset 0x14000 not configured
Feb 19 20:25:56 charm /netbsd: swim3 at obio0 offset 0x15000 not configured
Feb 19 20:25:56 charm /netbsd: adb0 at obio0 offset 0x16000 irq 18: 1 targets
Feb 19 20:25:57 charm /netbsd: aed0 at adb0 addr 0: ADB Event device
Feb 19 20:25:57 charm /netbsd: akbd0 at adb0 addr 2: extended keyboard
Feb 19 20:25:57 charm /netbsd: wskbd0 at akbd0: console keyboard
Feb 19 20:25:57 charm /netbsd: mesh0 at obio0 offset 0x18000 irq 13: 50MHz, SCSI ID 7
Feb 19 20:25:57 charm /netbsd: scsibus1 at mesh0: 8 targets, 8 luns per target
Feb 19 20:25:57 charm /netbsd: nvram0 at obio0 offset 0x1d000
Feb 19 20:25:57 charm /netbsd: bandit1 at mainbus0
Feb 19 20:25:57 charm /netbsd: pci1 at bandit1 bus 1
Feb 19 20:25:57 charm /netbsd: pci1: i/o space, memory space enabled
Feb 19 20:25:57 charm /netbsd: ofb0 at pci1 dev 11 function 0: Apple Computer Control
Feb 19 20:25:57 charm /netbsd: ofb0: 640 x 480, 8bpp
Feb 19 20:25:57 charm /netbsd: wsdisplay0 at ofb0: console (std, vt100 emulation), using wskbd0
Feb 19 20:25:57 charm /netbsd: Apple Computer PlanB (undefined subclass 0x00, revision 0x01) at pci1 dev 13 function 0 not configured
Feb 19 20:25:57 charm /netbsd: scsibus0: waiting 2 seconds for devices to settle...
Feb 19 20:25:57 charm /netbsd: scsibus1: waiting 2 seconds for devices to settle...
Feb 19 20:25:57 charm /netbsd: sd0 at scsibus1 target 0 lun 0: <SEAGATE, ST34520N, 1444> SCSI2 0/direct fixed
Feb 19 20:25:57 charm /netbsd: sd0: 4340 MB, 9006 cyl, 4 head, 246 sec, 512 bytes/sect x 8888924 sectors
Feb 19 20:25:57 charm /netbsd: boot device: sd0
Feb 19 20:25:57 charm /netbsd: root on sd0a dumps on sd0b
Feb 19 20:25:57 charm /netbsd: root file system type: ffs
Feb 19 20:25:56 charm savecore: no core dump