Subject: L2 cache problem symptom? (Was Re: newfs/vnd problems -- cg 0: bad magic number)
To: None <port-sgimips@netbsd.org>
From: sgimips NetBSD list <sgimips@mrynet.com>
List: port-sgimips
Date: 03/20/2002 08:23:26
I spoke too soon...

Previously, on  03/19/2002 09:12:59, I wrote:

> Previously, I had reported problems using the vnd device for generating
> a filesystem image on disk.  I had determined that it was only an issue
> when operating on a filesystem image file over NFS.
> 
> As of the 1.5ZB kernel, this problem has gone away.
> 
> Thanks to all who, directly or indirectly, resolved this issue :)

As it turns out, the problem doesn't occur with the 1.5ZB kernel
on an Indy R5K with the L2 cache forcibly disabled.

The problem STILL occurs on my R4400 with the L2 cache enabled.

The only difference between the two kernels is the following change to 
ip22.c and cpu.c:

mod80 (409)# diff -ruN cpu.c.old cpu.c
--- cpu.c.old   Wed Mar 20 08:17:08 2002
+++ cpu.c       Sun Mar 17 09:00:49 2002
@@ -71,8 +71,10 @@
        case COMPONENT_TYPE_SecondaryDCache:
        case COMPONENT_TYPE_SecondaryCache:
                mips_sdcache_size = COMPONENT_KEY_Cache_CacheSize(comp->Key);
+#ifdef L2_CACHE_ACTUALLY_WORKS
                mips_sdcache_line_size =
                    COMPONENT_KEY_Cache_LineSize(comp->Key);
+#endif
                /* XXX */
                mips_sdcache_ways = 1;
                break;
mod80 (410)# diff -ruN ip22.c.old ip22.c
--- ip22.c.old  Wed Mar 20 08:17:09 2002
+++ ip22.c      Sun Mar 17 09:00:49 2002
@@ -457,11 +457,15 @@
         * If we don't have an R4000-style cache, then initialize the
         * IP22 SysAD L2 cache.
         */
+#ifdef L2_CACHE_ACTUALLY_WORKS
        if (mips_sdcache_line_size == 0) {
+#endif
                /* XXX */
                printf("%s: disabling IP22 SysAD L2 cache\n", self->dv_xname);
                ip22_sdcache_disable();
+#ifdef L2_CACHE_ACTUALLY_WORKS
        }
+#endif
 }
 
 #endif /* IP22 */


I'm going to check the kernel on an R4000 and an R4600 later this week.  I'll
also try the L2 cache-disabled kernel on the R4400 to see if that effects a
change.

To restate the problem: the failure occurs only when the filesystem image
(configured through vnd) lives on an NFS-mounted filesystem.  It does not
happen when the image is on a local filesystem.

Typical test:

(Working kernel test)

mod80# dd if=/dev/zero of=ramdisk.fs count=6144
6144+0 records in
6144+0 records out
3145728 bytes transferred in 1.276 secs (2465304 bytes/sec)
mod80# vnconfig -v -c vnd0 ramdisk.fs
/dev/rvnd0c: 3145728 bytes on ramdisk.fs
mod80# Mar 20 08:21:26 mod81 last message repeated 2 times
newfs -B le -m 0 -o space -i 5120 /dev/rvnd0a
/dev/rvnd0a:    6144 sectors in 3 cylinders of 64 tracks, 32 sectors
        3.0MB in 1 cyl groups (12 c/g, 12.00MB/g, 640 i/g)
super-block backups (for fsck -b #) at:
 32,
mod80#

(Non-working kernel test)

mod81# dd if=/dev/zero of=ramdisk.fs count=6144
6144+0 records in
6144+0 records out
3145728 bytes transferred in 3.333 secs (943812 bytes/sec)
mod81# vnconfig -v -c vnd0 ramdisk.fs
vnd0: no disk label
/dev/rvnd0c: 3145728 bytes on ramdisk.fs
Mar 20 07:57:07 mod81 /netbsd: vnd0: no disk label
mod81# newfs -B le -m 0 -o space -i 5120 /dev/rvnd0a
vnd0: no disk label
/dev/rvnd0a:    6144 sectors in 3 cylinders of 64 tracks, 32 sectors
        3.0MB in 1 cyl groups (12 c/g, 12.00MB/g, 640 i/g)
super-block backups (for fsck -b #) at:
 32,
cg 0: bad magic number
mod81# 
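
For anyone who wants to repeat the test on other hardware, the steps above
can be wrapped in a small script.  This is only a sketch: the IMG and VND
variables and the function names newfs_output_ok and run_test are made up
for illustration, and the "vnconfig -u" cleanup step is not part of the
transcripts above.  I don't know whether newfs prints the failure on stdout
or stderr, or what it exits with, so the script captures both streams and
just looks for the "bad magic number" text.

```shell
#!/bin/sh
# Sketch only: rerun the vnd/newfs test and flag the cg magic failure.
# IMG and VND are assumed defaults; run as root from the NFS-mounted directory.
IMG=${IMG:-ramdisk.fs}
VND=${VND:-vnd0}

# Return 0 if the captured newfs output looks clean, 1 if it contains
# the failure text seen on the non-working kernel.
newfs_output_ok() {
    case "$1" in
        *"bad magic number"*) return 1 ;;
        *) return 0 ;;
    esac
}

# Same commands as in the transcripts, plus unconfiguring the vnd afterwards.
run_test() {
    dd if=/dev/zero of="$IMG" count=6144 || return 2
    vnconfig -v -c "$VND" "$IMG" || return 2
    out=$(newfs -B le -m 0 -o space -i 5120 "/dev/r${VND}a" 2>&1)
    vnconfig -u "$VND"
    printf '%s\n' "$out"
    if newfs_output_ok "$out"; then
        echo "PASS"
    else
        echo "FAIL: newfs reported a bad magic number"
        return 1
    fi
}

# Only touch the disk when explicitly asked to: "sh vndtest.sh run"
if [ "${1:-}" = "run" ]; then
    run_test
fi
```

Invoked without the "run" argument the script does nothing, so it is safe to
source when only the output check is wanted.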

More later as I learn more...

-scott