tech-kern archive


Strange PCI bug



I'm experiencing a strange, but reproducible, bug in PCI.

I tried to install NetBSD 8 on two old 32-bit x86 machines. The hardware
is completely different, but they both have an SiS controller. At boot
time, they both hit the first panic in fputrap(). I also tried booting
a -current kernel; the problem is the same as in -8.

Important note: the machines are single-core (uniprocessor).

The trace is:

	[somewhere in PCI] -> outl -> Xtrap10 -> fputrap -> panic

Having looked carefully at the place where the trap is received, I
am certain it is not caused by an actual FPU instruction.

On the two machines, the source of the trap is exactly the "outl"
instruction in the outl() function.

I was not able to locate exactly where this offending outl() is called
from. I know that it is called with:

	%eax = 0x80000010
	%edx = 0xcf8
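
For reference, port 0xcf8 is the address register of PCI configuration
mechanism #1, so the %eax value can be decoded by hand. Below is a
minimal sketch of that decoding in C. The struct and function names are
made up for illustration and are not taken from the NetBSD sources; the
bit layout is the standard one (bit 31 enable, bits 23-16 bus, 15-11
device, 10-8 function, 7-2 register).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Fields of a PCI configuration mechanism #1 address, i.e. the 32-bit
 * value written to I/O port 0xcf8. Names are hypothetical.
 */
struct pci_conf_addr {
	unsigned enable;	/* bit 31: configuration cycle enable */
	unsigned bus;		/* bits 23..16 */
	unsigned device;	/* bits 15..11 */
	unsigned function;	/* bits 10..8 */
	unsigned reg;		/* bits 7..2: dword-aligned register offset */
};

/* Split a raw 0xcf8 address value into its fields. */
static struct pci_conf_addr
decode_conf_addr(uint32_t v)
{
	struct pci_conf_addr a;

	a.enable = (v >> 31) & 0x1;
	a.bus = (v >> 16) & 0xff;
	a.device = (v >> 11) & 0x1f;
	a.function = (v >> 8) & 0x7;
	a.reg = v & 0xfc;
	return a;
}

/* Print the decoded fields in bus:device:function form. */
static void
print_conf_addr(uint32_t v)
{
	struct pci_conf_addr a = decode_conf_addr(v);

	printf("enable=%u tag=%u:%u:%u reg=0x%02x\n",
	    a.enable, a.bus, a.device, a.function, a.reg);
}
```

Calling print_conf_addr(0x80000010) gives bus 0, device 0, function 0,
register 0x10. If the standard type 0 header layout applies, that is the
first base address register of device 0:0:0, which is usually (but not
necessarily) the host bridge.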

The last thing the system displays is:

	siside0: Silicon Integrated Systems
	panic: ...

The code is in sys/dev/pci/siside.c. Narrowing it down: the panic happens
before the "pci_find_device" call finishes.

Therefore the trace should be:

	sis_chip_map -> pci_find_device -> [mystery] -> outl -> Xtrap10
		-> fputrap -> panic

I say "mystery", because when I try to add printfs, the initialization
of the other PCI devices generates a lot of output, and for some reason,
the systems then boot fine. In other words: if I introduce some delay
in the PCI code, the systems don't crash.

It is really strange. The fact that we receive exception 0x10 seems to
indicate that there is a problem somewhere related to the pin/irq
initialization, but I don't really see how the delay could fix that. Or
maybe the printfs do something more than just adding delay, I don't
really know.

Does this ring a bell for anyone? I'm running out of ideas... Phew, I
liked these machines...

