Subject: Re: data corruption (Re: Binding more than one IP to a NIC)
To: Simon Burge <simonb@wasabisystems.com>
From: Markus W Kilbinger <kilbi@rad.rwth-aachen.de>
List: port-cobalt
Date: 11/10/2004 10:26:51
>>>>> "Simon" == Simon Burge <simonb@wasabisystems.com> writes:
Izumi> Well, AFAIK the problem on cobalt PCI implementation
Izumi> affects memory mapped PCI devices (like siop), so I'm not
Izumi> sure if it fixes your "data corruption" problem. (patch
Izumi> attached, including MI changes filed in kern/27423)
>>
>> No, the data corruption problem remains, but my qube2 became
>> much more stable with your pci related patches! Beside the data
>> corruption problems now I 'only' see panics under heavy load of
>> the following kind:
>>
>> trap: TLB miss (load or instr. fetch) in kernel mode
>> status=0x2403, cause=0x8808, epc=0x80229214, vaddr=0xc874e000
>> pid=196 cmd=ttcp usp=0x7fffda10 ksp=0xc8749b08
>> Stopped in pid 196.1 (ttcp) at netbsd:r5k_pdcache_wb_range_32+0x58: cache 0x19,0x1a0(a0)
>> db> t
>> r5k_pdcache_wb_range_32+58 (c874de60,c874e3e0,5ea,5ea) ra 8022e468 sz 0
>> 8022e3d8+90 (c874de60,c874e3e0,5ea,5ea) ra 0 sz 0
>> User-level: pid 196.1
>> db>
Simon> I'm a little rusty on cache ops...
Thanks for joining in on the problem... :-)
In desperation I've built a kind of 'maximally optimized' kernel
(-march=r5k -O6), which seems to improve things a lot (more inlining?).
No panic so far under loads where the '-O2' kernel used to panic quite
reliably...
Just the 'pmap_unwire: wiring for pmap ...' messages continue (see
http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=21587 ).
Simon> The vaddr shows this is the first cache op after a page
Simon> boundary. We have no TLB entry for that page and because
Simon> this is a hit op we take an exception.
Similar to PR 21587?
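For reference, the numbers in the trap output can be checked against
Simon's reading with a few lines of C. This is only a sketch: the
helper names are made up for illustration, and the 4 KB page size is
an assumption about this kernel, not something shown in the trace.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: 4 KB pages; the trace itself does not state the page size. */
#define PAGE_SIZE_ASSUMED 0x1000u

/* ExcCode occupies bits 6..2 of the MIPS Cause register. */
static inline unsigned cause_exccode(uint32_t cause)
{
    return (cause >> 2) & 0x1fu;
}

/* Effective address of "cache op, offset(base)": base register plus offset. */
static inline uint32_t cache_op_addr(uint32_t base, int32_t offset)
{
    return base + (uint32_t)offset;
}

/* True if va is the first byte of a page (under the assumed page size). */
static inline bool is_page_start(uint32_t va)
{
    return (va & (PAGE_SIZE_ASSUMED - 1u)) == 0;
}
```

With a0 = 0xc874de60 and offset 0x1a0 from the faulting instruction,
cache_op_addr() gives 0xc874e000, which is the reported vaddr and, under
the 4 KB assumption, the start of a new page. cause_exccode(0x8808) is 2,
the TLB-miss-on-load/fetch code, as the trap message says. As far as I
can tell, op 0x19 is the hit-writeback op on the primary data cache,
which fits the "hit op" remark.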
Simon> I'll have a think about this later on, but just wanted to
Simon> get some random mumblings out now.
Tell me if/when I can test something,
Markus.