Subject: re: smparc - success with mixed modules?!
To: Paul Kranenburg <>
From: matthew green <>
List: port-sparc
Date: 01/24/2003 03:10:14
   I'm curious how far it gets though. I've removed a couple of the
   all-caches-are-equal assumptions.
   At least one of the remaining (in fact, the only one I can spot right now)
   can be circumvented by putting the module with the largest cache in 
   MBus slot #0.  It appears you have already done that.
   So, modulo all of the other hypersparc bugs left, that would not be
   a showstopper.

i decided to try this out.  it seems to work fine.  it was annoying
taking both dual hs100 modules out though.  ;-)

mainbus0 (root): SUNW,SPARCstation-10
cpu0 at mainbus0: mid 8: RT620/625 @ 150 MHz, on-chip FPU
cpu0: 512K byte write-back, 32 bytes/line, sw flush: cache enabled
cpu1 at mainbus0: mid 10: RT620/625 @ 100 MHz, on-chip FPU
cpu1: 256K byte write-back, 64 bytes/line, sw flush: cache enabled
cpu2 at mainbus0: mid 11: RT620/625 @ 100 MHz, on-chip FPU
cpu2: 256K byte write-back, 64 bytes/line, sw flush: cache enabled
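
as a quick sanity check of the "largest cache in slot #0" workaround, here is a
minimal sketch in C. it assumes (my guess, not confirmed anywhere above) that the
remaining all-caches-are-equal spot sizes its work from cpu0's cache, so things
are fine as long as no other module has a bigger one. `struct cpu_cache` and
`slot0_covers_all()` are hypothetical names made up for this example, not
anything in the kernel.

```c
#include <stddef.h>

/* Hypothetical per-CPU cache description, mirroring the dmesg lines. */
struct cpu_cache {
	int    mid;		/* MBus module id */
	size_t totalsize;	/* cache size in bytes */
	size_t linesize;	/* bytes per cache line */
};

/*
 * Return 1 if the first entry (the module in MBus slot #0) has a cache at
 * least as large as every other module, 0 otherwise (or if n == 0).
 */
static int
slot0_covers_all(const struct cpu_cache *c, size_t n)
{
	size_t i;

	if (n == 0)
		return 0;
	for (i = 1; i < n; i++)
		if (c[i].totalsize > c[0].totalsize)
			return 0;
	return 1;
}
```

with the three modules above (512K, 256K, 256K) the check passes; put one of the
256K modules first and it fails.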

evil-bro-39 ~> top -b
load averages:  8.52,  2.67,  1.08    15:33:18
57 processes:  8 runnable, 46 sleeping, 3 on processor

Memory: 45M Act, 2272K Wired, 6984K Exec, 6804K File, 281M Free
Swap: 399M Total, 399M Free

  338 mrg       64    0  4236K 4744K RUN/0      0:19 37.05% 34.91% cc1
  336 mrg       63    0  3688K 4232K RUN/0      0:15 26.11% 24.61% cc1
  334 mrg       64    0  4472K 4984K RUN/1      0:16 24.02% 22.71% cc1
  380 mrg       63    0  4104K 4612K RUN/0      0:06 37.08% 22.02% cc1
  373 mrg       63    0  3604K 4120K RUN/2      0:05 25.53% 17.04% cc1
  384 mrg       63    0  3380K 3892K RUN/0      0:03 21.54% 11.38% cc1
  389 mrg       64    0  3316K 3828K RUN/2      0:02 16.29%  5.91% cc1
    4 root     -18    0     0K   42M reaper/1   0:10  4.59%  4.59% [reaper]
  399 mrg       63    0   648K 1152K CPU/0      0:00 24.16%  3.37% cpp0
  349 mrg       18    0  1460K 2000K pause/2    0:01  3.54%  3.12% tcsh
  398 mrg       10    0   212K  716K wait/2     0:00 14.26%  2.59% sh
  401 mrg       63    0   472K 1184K CPU/1      0:00 24.60%  2.34% as
    5 root      18    0     0K   42M syncer/0   0:02  1.66%  1.66% [ioflush]
  394 mrg       10    0   136K  608K wait/0     0:00  3.58%  0.93% gcc
  393 mrg       10    0   212K  716K wait/2     0:00  3.01%  0.78% sh
    6 root     -18    0     0K   42M aiodon/1   0:00  0.63%  0.63% [aiodoned]
  382 mrg       10    0   212K  716K wait/1     0:00  0.97%  0.54% sh
  400 mrg       10    0   136K  608K wait/2     0:00  5.13%  0.49% gcc

hmmmm, seems the 2xhs100 + hs150 is roughly as fast as the 4xhs100,
i wonder if that is the GiantLock law of diminishing returns or not.. :)


ps: i have this patch to make hypersparc MP not fail, but as it
disables the icache, there is a performance hit (it's not _that_
drastic though, certainly usable)

Index: sparc/cache.c
RCS file: /cvsroot/src/sys/arch/sparc/sparc/cache.c,v
retrieving revision 1.77
diff -p -r1.77 cache.c
*** sparc/cache.c	2003/01/20 22:15:54	1.77
--- sparc/cache.c	2003/01/23 13:52:46
*************** hypersparc_cache_enable()
*** 224,235 ****
  	if (CACHEINFO.c_hwflush)
  		panic("cache_enable: can't handle 4M with hw-flush cache");
  	 * Enable instruction cache and, on single-processor machines,
  	 * disable `Unimplemented Flush Traps'.
  #if defined(MULTIPROCESSOR)
--- 224,239 ----
  	if (CACHEINFO.c_hwflush)
  		panic("cache_enable: can't handle 4M with hw-flush cache");
+ ls = CACHEINFO.ic_linesize;
+ ts = CACHEINFO.ic_totalsize;
+ for (i = 0; i < ts; i += ls)
+   sta(i, ASI_ICACHETAG, 0);
  	 * Enable instruction cache and, on single-processor machines,
  	 * disable `Unimplemented Flush Traps'.
  #if defined(MULTIPROCESSOR)
! 	v = /*HYPERSPARC_ICCR_ICE |*/ (/*ncpu == 1 ? HYPERSPARC_ICCR_FTD :*/ 0);
  #endif
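
for anyone who wants to see what the added loop does without a hypersparc at
hand, here is a minimal user-space sketch in C. it only simulates the idea: walk
the instruction cache one line at a time and write a zero tag for each line.
`struct sim_icache`, `sim_sta_tag()` and `sim_icache_tag_clear()` are
hypothetical names for this example; the real code stores through
`sta(i, ASI_ICACHETAG, 0)`, and the sizes in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_LINES 4096	/* arbitrary bound for the simulated tag RAM */

/* Hypothetical stand-in for the icache tag RAM. */
struct sim_icache {
	uint32_t tags[SIM_MAX_LINES];	/* one tag word per cache line */
	size_t   linesize;		/* bytes per line (cf. ic_linesize) */
	size_t   totalsize;		/* bytes in the cache (cf. ic_totalsize) */
};

/* Simulated store-alternate: set the tag of the line covering offset `off`. */
static void
sim_sta_tag(struct sim_icache *ic, size_t off, uint32_t val)
{
	ic->tags[off / ic->linesize] = val;
}

/*
 * Zero every tag, stepping through the cache in line-sized increments,
 * the same shape as the loop in the patch. Returns the number of lines
 * touched.
 */
static size_t
sim_icache_tag_clear(struct sim_icache *ic)
{
	size_t i, n = 0;

	assert(ic->linesize > 0);
	assert(ic->totalsize / ic->linesize <= SIM_MAX_LINES);

	for (i = 0; i < ic->totalsize; i += ic->linesize) {
		sim_sta_tag(ic, i, 0);
		n++;
	}
	return n;
}
```

e.g. with a made-up 8192 byte cache and 32 byte lines the loop does 256 tag
writes and leaves every tag within the cache zeroed.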