Port-sparc64 archive


Re: Boot NetBSD9.1 on a SUN Blade 100



Ok, so I booted both kernels and guess what: no change in behaviour.

Then, while booting netbsd.gdb, I interrupted the boot at two different points in time: before the line

[  63.9893876] admtemp0: workqueue busy: updates stopped

and after it, because there is a long pause between the detection of the USB hub and the admtemp0 message (which might well be the indicated 60 seconds).

Backtrace from before the admtemp0 timeout:

[   1.1200107] uhub0 at usb0: NetBSD (0000) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
Stopped in pid 0.5 (system) at  netbsd:cpu_Debugger+0x4:        nop
db{0}> bt
intr_list_handler(101dbc880, 7, e00479b0, 0, 1044060, 2) at netbsd:intr_list_handler+0x10
sparc_interrupt(101dbcc40, 1, e0047b90, 0, 1044020, e0048000) at netbsd:sparc_interrupt+0x294
sparc_interrupt(1c70240, 70000000001, 2014000, 101db6ca0, 1, 1c72000) at netbsd:sparc_interrupt+0x294
frag6_slowtimo(1c95730, 101db6ca0, 1c70000, 1cc5000, 1ca3650, 101db6ca4) at netbsd:frag6_slowtimo+0x24
pfslowtimo(0, 101d88041, 0, 1cba478, 1c3c938, 18192b8) at netbsd:pfslowtimo+0x40
callout_softclock(1cba480, 1000000, 10000, 30c0, 20c0, 1cba520) at netbsd:callout_softclock+0xc8
softint_dispatch(2000, 1, 0, 101db6ca0, 1779500c0, 177950360) at netbsd:softint_dispatch+0x80
softint_fastintr(101db6ca0, 1, e0047cf0, 0, 1044020, e0048000) at netbsd:softint_fastintr+0x80
sparc_interrupt(f0056c1c, 1140d0, 1173b8, 0, fff57b48, 1) at netbsd:sparc_interrupt+0x294

Backtrace from after the admtemp0 timeout:

[   1.1200103] uhub0 at usb0: NetBSD (0000) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
[  63.5593913] admtemp0: workqueue busy: updates stopped
Stopped in pid 0.5 (system) at  netbsd:cpu_Debugger+0x4:        nop
db{0}> bt
intr_list_handler(101dbc880, 1, e0047b90, 101db6ca0, 1044060, e0048000) at netbsd:intr_list_handler+0x10
sparc_interrupt(101d88040, 101d88041, 101db6ca0, 1cc5000, 1ca3400, 1cc7400) at netbsd:sparc_interrupt+0x294
callout_schedule_locked(101d88040, 101d88040, 1a80, 1cba478, 1cba478, 101d88040) at netbsd:callout_schedule_locked+0x94
callout_softclock(1cba480, 1000000, 10000, 30c0, 20c0, 1cba520) at netbsd:callout_softclock+0x274
softint_dispatch(2000, 1, 0, 101db6ca0, 1779500c0, 177950360) at netbsd:softint_dispatch+0x80
softint_fastintr(101db6ca0, 1, e0047cf0, 0, 1044020, e0048000) at netbsd:softint_fastintr+0x80
sparc_interrupt(f0056c1c, 1140d0, 1173b8, 0, fff57b48, 1) at netbsd:sparc_interrupt+0x294
db{0}>

Of course, the actual function that shows up in the bt at the time of the interrupt (BREAK) seems arbitrary, as I got different results in different runs. I am not sure how to read the backtrace: is the topmost line the last return address on the stack, so that you would go from top to bottom to learn the calling sequence, or is it the other way round? And does the interrupt from the BREAK show up in the backtrace?

However, something striking is that callout_softclock is always involved...
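
To check that observation across runs without eyeballing, the frames can be compared mechanically. The following is only a rough sketch: the helper names (`parse_frames`, `common_symbols`) are made up for illustration, the arguments are elided with `(...)`, and it assumes each frame ends in `at netbsd:SYMBOL+0xOFFSET` as in the output above. It also assumes ddb's usual ordering, with the innermost frame printed first.

```python
import re

# A ddb frame ends with "at netbsd:SYMBOL+0xOFFSET" (single space after "at").
# The "Stopped in pid ... at  netbsd:cpu_Debugger+0x4" line has two spaces
# and is therefore not treated as a frame.
FRAME_RE = re.compile(r"\bat netbsd:(\w+)\+0x([0-9a-fA-F]+)")


def parse_frames(bt_text):
    """Return [(symbol, offset), ...] in the order ddb printed them."""
    return [(m.group(1), int(m.group(2), 16)) for m in FRAME_RE.finditer(bt_text)]


def common_symbols(first, *others):
    """Symbols found in every backtrace, ordered as in the first one."""
    other_sets = [{sym for sym, _ in parse_frames(bt)} for bt in others]
    seen, result = set(), []
    for sym, _ in parse_frames(first):
        if sym not in seen and all(sym in s for s in other_sets):
            seen.add(sym)
            result.append(sym)
    return result


# A few frames copied from the two backtraces above (arguments elided).
BEFORE = """
frag6_slowtimo(...) at netbsd:frag6_slowtimo+0x24
pfslowtimo(...) at netbsd:pfslowtimo+0x40
callout_softclock(...) at netbsd:callout_softclock+0xc8
softint_dispatch(...) at netbsd:softint_dispatch+0x80
"""
AFTER = """
callout_schedule_locked(...) at netbsd:callout_schedule_locked+0x94
callout_softclock(...) at netbsd:callout_softclock+0x274
softint_dispatch(...) at netbsd:softint_dispatch+0x80
"""

if __name__ == "__main__":
    print(common_symbols(BEFORE, AFTER))
```

Note that softint_dispatch is common to both as well, so callout_softclock showing up may simply mean that the BREAK tends to land while the callout soft interrupt is running.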


On 17.06.21 at 13:20, Julian Coleman wrote:
Hi,

> hmm, strange. Nothing attached to the Firewire-Ports. The code looks like that
> watchdog_clock never gets reset? Is there a way to disable the FW-ports via a
> boot.conf or something?

Unfortunately, we need to remove it from the kernel.

> I downloaded the source for the 9.2-kernel and there seems to be a mismatch
> in the versions of firewire.c on the website you mentioned and what I have
> found in the kernel tree.
> The version in 9.2 seems to be 1.48 and the version on the website is 1.51.
> Actually, line 1323 in version 1.48 is in a different function...
> Which kernel is generated by the sources from the website?

nxr.netbsd.org has the current sources.  The 9.x kernels are on a different
branch, which is why you see the different versions.  There is also the
CVS history for the file at:

   http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/ieee1394/firewire.c

Looking at that code though, it hasn't changed for a long time.  So, I was
wondering if some other change is causing that problem.  It is tempting
to try increasing the multiplier from 15 to something larger to see if we
just need to wait longer [1], or just to try a kernel without FW to really
check that it is the problem.  I've built a kernel from GENERIC without FW:

   http://ftp.netbsd.org/pub/NetBSD/misc/jdc/sparc64/netbsd

and if you are able to test boot that, then it would be useful [2].  There
is also the version with full debugging symbols [3]:

   http://ftp.netbsd.org/pub/NetBSD/misc/jdc/sparc64/netbsd.gdb

and the kernel configuration that I used:

   http://ftp.netbsd.org/pub/NetBSD/misc/jdc/sparc64/GENERIC-NOFW

Regards,

Julian

[1] Instead of waiting, we might just be able to check if we are running
with interrupts after start, like the check here:

   https://nxr.netbsd.org/xref/src/sys/arch/sparc/dev/ts102.c#1044

[2] my test netbsd-9 has a few local changes in some drivers, but nothing
that should affect this.

[3] The .gdb file is useful because we can match a backtrace to a source
line.  For example:

firewire_watchdog(101d8a040, 101d8a041, 0, 1cba038, 1cba038, 101d8a040) at
netbsd:firewire_watchdog+0x48

:; gdb netbsd.gdb
(gdb) list *(firewire_watchdog+0x48)
   ...
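
With a longer backtrace, typing one `list` command per frame gets tedious. As a rough sketch (the helper name `gdb_argv` is made up for illustration, and it assumes the frames look like the ddb output in this thread), the lookups can be collected into a single non-interactive gdb invocation using gdb's `-batch` and `-ex` options:

```python
import re
import shlex

# Match "at netbsd:SYMBOL+0xOFFSET"; \s+ also copes with a frame that the
# mail client wrapped between "at" and "netbsd:".
FRAME_RE = re.compile(r"\bat\s+netbsd:(\w+)\+0x([0-9a-fA-F]+)")


def gdb_argv(bt_text, kernel="netbsd.gdb"):
    """Build a gdb command line that lists the source for every frame."""
    argv = ["gdb", "-batch"]
    for sym, off in FRAME_RE.findall(bt_text):
        argv += ["-ex", f"list *({sym}+0x{off})"]
    argv.append(kernel)  # assumed to be in the current directory
    return argv


# The example frame from above, including the line wrap.
FRAME = (
    "firewire_watchdog(101d8a040, 101d8a041, 0, 1cba038, 1cba038, 101d8a040) at\n"
    "netbsd:firewire_watchdog+0x48"
)

if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(gdb_argv(FRAME)))
```

The script only prints the command and does not run gdb itself, so it can be used on a machine other than the one holding netbsd.gdb.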


