Subject: Re: Netbsd 3.0 Crash on file transfert
To: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
From: Thierry Rangeard <thierry.rangeard@gmail.com>
List: port-cobalt
Date: 03/17/2006 19:17:23
Thanks a lot for your quick and sharp analysis.


I think you are on the right track and have put your finger on the
right problem, because I can compile and put a heavy load on the
current kernel without any problem.
The only thing I can mention, though I don't know if it's related, is
that the swap is never used, nor does it decrease, during this load.

Does it happen on all kernel branches, or is it specific to 3.0?

Sorry, I would like to be more involved, but I'm not a C developer.
What exactly does your patch do?
I can also run the test with a 3Com PCI card; I have to check whether
it's supported by the kernel.

regards.
--
Thierry

On 17 March 06 at 18:44, Izumi Tsutsui wrote:

> In article <33B47346-9A53-462D-9719-D4E39F13AFFD@gmail.com>
> thierry.rangeard@gmail.com wrote:
>
>> I tried nothing special, it was during a scp transfert with a large
>> 100MB file. As you could see in the stack trace the stopped process
>> was ssh.
>
> Yes, it could happen during large network xfers.
>
>> Trap: TLB miss (load.inst.fetch) in Kernelmode status=0x2403,
>> cause=0x8, epc=0x810cfde0, vaddr=0xcc708000 pid 911
>> cmd=sshd usp=0x7fffd4f0 Ksp=0xcc6ebb08
>> stopped in pid 911.1 (ssh) at kernel:r5k_pdcache_wb_range_32+0x9c:
>> cache 0x19, 0x3c0(a0)
>
> This is in cobalt/cobalt/bus.c:_bus_dmamap_sync().
> When I replaced this mips_dcache_wb_range() with
> mips_dcache_wbinv_range() the panic still happened,
> so it seems that wrong VA passed from bus_dmamap_sync()
> on BUS_DMASYNC_PREWRITE (i.e. TX of tlp(4)) causes
> the problem, but I can't see why it happens only on
> TX, not RX (i.e. BUS_DMASYNC_PREREAD).
>
> On the other hand, I can't reproduce this problem
> on R5000 O2, so I wonder if it's Rm52xx specific or
> tlp(4) specific.
>
> Could anyone try if the similar panic could happen
> with other NIC on a PCI slot of Qube2?
>
>> Is there a sysctl parameter that I can use ?
>
> The attached patch could hide a panic (as workaround),
> but maybe I (or someone) should check Rm52xx manual more closely.
> (I'm afraid there are some more Rm52xx specific CP0 hazards
>  not handled currently)
> ---
> Index: arch/cobalt/cobalt/bus.c
> ===================================================================
> RCS file: /cvsroot/src/sys/arch/cobalt/cobalt/bus.c,v
> retrieving revision 1.24
> diff -u -r1.24 bus.c
> --- arch/cobalt/cobalt/bus.c	1 Mar 2006 12:38:11 -0000	1.24
> +++ arch/cobalt/cobalt/bus.c	17 Mar 2006 17:40:58 -0000
> @@ -621,7 +621,11 @@
>  			break;
>
>  		case BUS_DMASYNC_PREWRITE:
> +#if 0	/* XXX */
>  			mips_dcache_wb_range(addr + offset, minlen);
> +#else
> +			mips_dcache_wbinv_range_index(addr + offset, minlen);
> +#endif
>  			break;
>  		}
>  #ifdef BUS_DMA_DEBUG
>
> ---
> Izumi Tsutsui