Subject: Re: lint issues with larger caches
To: Riccardo Mottola <rollei@tiscalinet.it>
From: John Klos <john@ziaspace.com>
List: tech-toolchain
Date: 09/22/2005 01:22:06
>> Would anyone object to me committing the following change to
>> src/usr.bin/xlint?

> I would be against this. It is not a real solution and you say yourself
> "it happens with much less frequency".

"Less frequency" wasn't about the fix. It was about the 7455 with 256k of 
CPU-speed L2 cache. With -O0, it ALWAYS completes the source tree 
build (I've run it four times through already today, then tried it once 
without the -O0 just to be sure, then tried again with -O0, so I'm sure).

> Does your problem happen during a kernel build? I have a G4 with an
> external L2 cache of 1Mb and I can build a kernel... if it doesn't crash
> totally.

-current is completely unusable for me, if that's what you're talking 
about. In between lockups in -current, I saw the same problems. I was able 
to build kernels between lockups in both -current and release, though.

> I keep thinking that NetBSD has a problem with caches... we had a
> thread some weeks ago about problems I got when I upgraded to a g4/400
> cache card with 1Mb of cache. Currently I have a quite stable kernel
> since I put the cache in "WT" mode, which results in some performance
> loss though. People on this list said that varying compilation
> optimization levels of the kernel would make the problem go away. While I
> verified that various -O levels had a different degree of crash frequency
> (up to not being able to complete a boot+login sometimes) it was just a
> question of how much the computer was stressed and sooner or later it
> would crash.

I don't think that that is a related issue. I have many kinds of G3s and 
G4s with both on-CPU and off-CPU L2, and some with L3, that are rock 
solid. If there were a problem in general with caches, many more people 
would be seeing issues. If I were you, I'd try very conservative settings 
with NetBSD-release and see if you get it to where it's completely stable, 
then move to less conservative settings until you get to the place where 
you think it should be.

But if you're running -current, all bets are off. It locks up regularly on 
hardware which is otherwise completely stable.

John Klos