Cache corruption on OMAP? (was Re: linking on ARM fails)

To: port-arm%netbsd.org@localhost, tech-kern%netbsd.org@localhost
Subject: Cache corruption on OMAP? (was Re: linking on ARM fails)
From: Mikko Rapeli <mikko.rapeli%teleca.com@localhost>
Date: Mon, 8 Dec 2008 13:05:24 +0200

This linking problem[1] might be cache corruption: a kernel with the data
cache disabled[2] managed to compile a booting kernel over the weekend.
Without the data cache the OMAP2420 is really slow, so it is also possible
that some other timing-related bug simply does not get triggered.

The test case is basically: compile some big project like the NetBSD
kernel or TET (http://wiki.netbsd.se/How_to_run_TET_framework).

Linking TET failed because proctc.o was corrupt and not recognized by the
linker. Manual proctc.c recompilations produced similar results until I
removed -O from the cc invocation. After that, linking succeeds and the tcc
executable seems to run ok. I have tried to produce a simple test case
but have failed so far. After a TET or kernel compilation has failed, a
simple 'cat largefile > perfect_copy' produces a corrupt perfect_copy,
which was easily verified on the NFS server side with md5sum. But on
a freshly booted system I could not get a corrupt copy from cat, even with
various scripts consuming network bandwidth, CPU time and memory in the
background.
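
For anyone who wants to repeat the cat check in a loop, here is a minimal
sketch. The function names copy_and_verify and stress_copy are hypothetical
helpers, not part of any existing tool, and cksum is used only because it is
in POSIX. Note that this compares checksums on the same machine, so it could
miss corruption that affects both reads in the same way; the md5sum run on the
NFS server side remains the more trustworthy check.

```sh
#!/bin/sh
# Hypothetical helper: copy a file with cat and compare checksums.
# Returns 0 if the copy matches, 1 on a mismatch, 2 if cat itself fails.
copy_and_verify() {
    src=$1
    dst=$2
    cat "$src" > "$dst" || return 2
    # cksum reads from stdin so its output carries no file name
    want=$(cksum < "$src")
    got=$(cksum < "$dst")
    [ "$want" = "$got" ]
}

# Hypothetical helper: repeat the copy and stop at the first failure.
stress_copy() {
    src=$1
    dst=$2
    count=$3
    i=1
    while [ "$i" -le "$count" ]; do
        if ! copy_and_verify "$src" "$dst"; then
            echo "copy failed or mismatched on iteration $i" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count copies matched"
}
```

After sourcing the functions, 'stress_copy largefile perfect_copy 100' would
run the copy a hundred times and report the first iteration that differs.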

Since current on OMAP has been broken for so long, I can't just try
bisecting for the commits which might have caused this. Are there other
kernel debugging tricks which might help in finding out what is causing
this corruption?

I have been working with current from Dec 2nd with the stability-increasing
interrupt fix http://mail-index.netbsd.org/port-arm/2008/11/27/msg000466.html
applied.

-Mikko

[1] http://mail-index.netbsd.org/port-arm/2008/11/26/msg000464.html

[2] Disable data cache for OMAP2420:

--- a/sys/arch/arm/arm/cpufunc.c
+++ b/sys/arch/arm/arm/cpufunc.c
@@ -2487,7 +2487,9 @@ arm1136_setup(char *args)

        cpuctrl =
                CPU_CONTROL_MMU_ENABLE  |
-               CPU_CONTROL_DC_ENABLE   |
+               /* disable data cache for testing if it's corrupting files
+                * CPU_CONTROL_DC_ENABLE   |
+                */
                CPU_CONTROL_WBUF_ENABLE |
                CPU_CONTROL_32BP_ENABLE |
                CPU_CONTROL_32BD_ENABLE |
