Subject: Race condition in kernel? (was ping times)
To: None <port-macppc@netbsd.org>
From: Donald Lee <MacPPC@caution.icompute.com>
List: port-macppc
Date: 07/15/2004 21:41:48
I have not found the problem, but I have cornered it.  Can someone
who knows the workings of the bowels of the kernel tell me how
I might proceed?

I posted earlier about ping times of 10 ms (see thread "NetBSD 1.6.2
ping times again"), but found that the problem is only seen on my
machine when it has a Sonnet 1 GHz upgrade CPU installed.  (The machine
is a G4/AGP, originally with a 450 MHz CPU.)

I then re-installed the original CPU.

Now the problem is gone.

I conclude that I have one of two problems in the kernel.

	1. There is a race condition that surfaces when the CPU is
	"fast enough".  I don't have enough CPUs of different speeds
	to test this theory.

	2. There is something about the cache, or the specific model
	of CPU that causes events to be "lost".

If it's a race condition, it's a problem lurking in the weeds for us.
If it's a cache thing, then it'll likely end up biting other people
with upgrade cards.

Below are the log excerpts for the two different CPUs.  One is the
original (450 MHz), which seems to work correctly.  The other is the
1 GHz upgrade (L3 cache, different G4 type), which exhibits the problem.

When I was poking around in the kernel, it looked to me like the
variable "astpending" was a possible culprit for this sort of
thing, because references were surrounded by "isync" type instructions
and it appeared to be *the* mechanism to signal resched() events.
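
To make the "lost event" idea concrete, here is a minimal sketch in
Python of how a pending flag can drop an event when it is tested and
cleared in two separate steps.  This is a toy model, not kernel code:
the class, the method names and the "interrupt_in_window" switch are
all hypothetical, and nothing here is taken from the NetBSD source or
shows that "astpending" is actually handled this way.

```python
# Hypothetical model of a "pending" flag shared by an interrupt handler and
# the code that services it. None of these names come from the NetBSD source.

class PendingFlag:
    def __init__(self):
        self.pending = 0   # stands in for a flag such as astpending
        self.handled = 0   # number of events actually serviced

    def post(self):
        """Interrupt side: signal that work is waiting."""
        self.pending = 1

    def service_racy(self, interrupt_in_window=False):
        """Test the flag, service the event, then clear the flag last."""
        if self.pending:
            self.handled += 1          # service the event seen by the test
            if interrupt_in_window:
                self.post()            # a second event lands before the clear
            self.pending = 0           # the clear wipes out the second event

    def service_safe(self, interrupt_in_window=False):
        """Clear the flag before servicing, so a later post stays visible.

        In real code the test-and-clear would also have to be atomic with
        respect to interrupts; this model only shows the ordering.
        """
        if self.pending:
            self.pending = 0
            self.handled += 1
            if interrupt_in_window:
                self.post()            # survives until the next pass


def run(service_name):
    """Post one event, let a second arrive mid-service, then poll again."""
    flag = PendingFlag()
    service = getattr(flag, service_name)
    flag.post()                         # first event
    service(interrupt_in_window=True)   # second event arrives mid-service
    service()                           # next pass picks up anything left
    return flag.handled


if __name__ == "__main__":
    print("racy handled:", run("service_racy"))
    print("safe handled:", run("service_safe"))
```

Two events are posted in both runs.  The racy ordering services only
one of them, and the other is not noticed until something posts again,
which is the kind of behaviour that would show up as extra latency.
Whether the window is ever hit would depend on timing, which is why a
faster CPU could expose it.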

Ideas, anyone?  Is there a way to disable the L3 cache?

Tell me where to go. ;->

Thanks,

-dgl-


OLD (working) CPU:

Jul 15 21:22:20 grace syslogd: restart 
Jul 15 21:22:20 grace /netbsd: NetBSD 1.6.2 (try6) #3: Wed Jul  7 19:33:18 CDT 2004 
Jul 15 21:22:20 grace /netbsd:     donlee@grace:/usr/src.162.mod/sys/arch/macppc/compile/try6
Jul 15 21:22:20 grace /netbsd: total memory = 896 MB
Jul 15 21:22:20 grace /netbsd: avail memory = 813 MB
Jul 15 21:22:20 grace /netbsd: using 2048 buffers containing 45976 KB of memory
Jul 15 21:22:20 grace /netbsd: mainbus0 (root)
Jul 15 21:22:20 grace /netbsd: cpu0 at mainbus0: 7400 (Revision 2.6), ID 0 (primary)
Jul 15 21:22:20 grace /netbsd: cpu0: HID0 8094c2a4<EMCP,DOZE,DPM,EIEC,ICE,DCE,SPD,SGE,BTIC,BHT> 
Jul 15 21:22:20 grace /netbsd: cpu0: 450.00 MHz
Jul 15 21:22:20 grace /netbsd: cpu0: 1MB backside cache
Jul 15 21:22:20 grace /netbsd: uninorth0 at mainbus0
Jul 15 21:22:20 grace /netbsd: pci0 at uninorth0 bus 0
Jul 15 21:22:20 grace /netbsd: pci0: i/o space, memory space enabled



NEW (failing) CPU:

Jul  7 19:35:17 grace /netbsd: NetBSD 1.6.2 (try6) #3: Wed Jul  7 19:33:18 CDT 2004
Jul  7 19:35:17 grace /netbsd:     donlee@grace:/usr/src.162.mod/sys/arch/macppc/compile/try6
Jul  7 19:35:17 grace /netbsd: total memory = 896 MB
Jul  7 19:35:17 grace /netbsd: avail memory = 813 MB
Jul  7 19:35:17 grace /netbsd: using 2048 buffers containing 45976 KB of memory
Jul  7 19:35:17 grace /netbsd: mainbus0 (root)
Jul  7 19:35:17 grace /netbsd: cpu0 at mainbus0: 7455 (Revision 2.1), ID 0 (primary)
Jul  7 19:35:17 grace /netbsd: cpu0: HID0 8450c0bc<EMCP,TBEN,NAP,DPM,ICE,DCE,SGE,BTIC,LRSTK,FOLD,BHT>
Jul  7 19:35:17 grace /netbsd: cpu0: 1000.00 MHz
Jul  7 19:35:17 grace /netbsd: cpu0: 256KB L2 cache, 2MB L3 backside cache
Jul  7 19:35:17 grace /netbsd: uninorth0 at mainbus0
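

Apart from clock speed and cache, the two logs above also differ in
the HID0 flags the kernel reports.  The following is a small sketch
that pulls the flag names out of the two HID0 lines and lists the ones
that appear on only one CPU.  The helper names are made up for this
example, and the comparison is by name only: the 7400 and 7455 are
different parts, so some differences may simply reflect different
register layouts rather than a configuration choice.

```python
import re

# HID0 lines copied from the two boot logs in this message.
OLD_HID0 = "cpu0: HID0 8094c2a4<EMCP,DOZE,DPM,EIEC,ICE,DCE,SPD,SGE,BTIC,BHT>"
NEW_HID0 = "cpu0: HID0 8450c0bc<EMCP,TBEN,NAP,DPM,ICE,DCE,SGE,BTIC,LRSTK,FOLD,BHT>"


def hid0_flags(line):
    """Return the flag names listed between '<' and '>' on a HID0 line."""
    match = re.search(r"HID0 [0-9a-f]+<([^>]*)>", line)
    if match is None:
        raise ValueError("not a HID0 line: %r" % line)
    return [name for name in match.group(1).split(",") if name]


def flag_diff(old_line, new_line):
    """Return (flags only in old_line, flags only in new_line), in log order."""
    old = hid0_flags(old_line)
    new = hid0_flags(new_line)
    only_old = [flag for flag in old if flag not in new]
    only_new = [flag for flag in new if flag not in old]
    return only_old, only_new


if __name__ == "__main__":
    only_old, only_new = flag_diff(OLD_HID0, NEW_HID0)
    print("only on 7400 (450 MHz):", ", ".join(only_old))
    print("only on 7455 (1 GHz):  ", ", ".join(only_new))
```

By name, the 450 MHz CPU alone shows DOZE, EIEC and SPD, and the 1 GHz
CPU alone shows TBEN, NAP, LRSTK and FOLD.  The logs by themselves do
not say whether any of these has anything to do with the ping times.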