Port-mips archive


re: SGI O2 hangs hard constantly



> The hangs are sometimes preceded by the following console errors
> Sep 26 13:01:22 dribble /netbsd: crime: cpu error 4 at address 320046432
> Sep 26 13:06:12 dribble /netbsd: crime: cpu error 4 at address 320057696
>
> the address is different each time. These errors do not always precede a
> lock up but if i see them it will eventually lock up.
>
> I get anywhere from 40 mins to 7hrs of compile time, it appears to be
> random how long it will work for.

late reply, but i am trying to figure out this problem.

for me, it seems to happen only when using the network.
if i ensure everything is local, then i am unable to
trigger a hang.  i've also found that installing a PCI
re(4) and using it can trigger the hang in a few seconds.

it may have something to do with mips/bus_dma.c: if i
add logging printf()s around bus_dmamap_sync(), then as
long as they are not too short the hang does not occur
(but network IO is then slower than 1Mbit/s.)  since it
happens with both re(4) and the onboard mec(4), it seems
possible the problem is in bus_dmamap_load_mbuf() here,
but so far there are no clues as to what is going on.

i believe the "error 4" means the CPU triggered an
"illegal address" error, but i do not yet know what this
means in the real world.  i have noticed that the address
is always in the 0x1000_0000 - 0x1400_0000 range.  i may
try removing 2 dimms from my system, reducing it from
256MB to 192MB... maybe this problem occurs on R10K/R12K
systems where memory is above the first 256MB of the
physical address space?
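
for reference, here is a rough sketch of the range check on
the two addresses quoted above.  it assumes the crime message
prints the address in decimal (there is no 0x prefix in the
log), and the names LOW, HIGH, LOGGED and describe() are made
up for this example; they are not from any netbsd source.

```python
# hypothetical helper for illustration only, not part of any NetBSD tool
LOW = 0x1000_0000    # 256MB, lower bound of the range mentioned above
HIGH = 0x1400_0000   # 320MB, upper bound of the range mentioned above

# addresses exactly as printed in the two console messages,
# assumed here to be decimal
LOGGED = [320046432, 320057696]


def describe(addr):
    """Return (hex string, offset in MB, True if LOW <= addr < HIGH)."""
    return hex(addr), addr / (1 << 20), LOW <= addr < HIGH


for a in LOGGED:
    h, mb, inside = describe(a)
    print(f"{a} = {h} (~{mb:.1f}MB) in range: {inside}")
```

under that assumption both addresses come out as 0x13138560
and 0x1313b160, i.e. a little above 305MB and so past the
256MB mark.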


.mrg.

