Subject: Alpha MP goes multi-user, code committed to NetBSD-current
To: None <port-alpha@netbsd.org>
From: Jason R Thorpe <thorpej@zembu.com>
List: tech-smp
Date: 04/20/2001 11:41:36
Folks, today I got multiprocessor Alpha kernels to go multi-user,
running both user and kernel code on multiple processors.  It even
seems to be pretty stable (did a kernel build, doing a full userland
build right now, and will build an Alpha binary snapshot with the
MP kernel later).

Here's what top(1) looked like while doing a "make -j4" of a kernel:

load averages:  6.76,  4.92,  3.33                                     11:07:27
43 processes:  2 runnable, 39 sleeping, 2 on processor
CPU states: 54.674934.4  0.0% nice, 44.674937.2m,  0.0% interrupt,  0.81502.9.9
Memory: 143M Act, 5168K Wired, 751M Free, 2046M Swp free

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
 5133 thorpej   62    0  9296K 9208K RUN/1      0:01 45.47%  8.25% cc1
 5179 thorpej   62    0  6112K 6024K CPU/0      0:00 52.00%  2.54% cc1
 5178 thorpej   63    0  5320K 5232K RUN/1      0:00 34.00%  1.66% cc1
 4838 thorpej    2    0  1320K 2184K select/1   0:00  0.42%  0.29% make
 5170 thorpej   10    0   264K 1064K wait/1     0:00  1.03%  0.10% cc
 5174 thorpej   10    0   264K 1064K wait/0     0:00  2.00%  0.10% cc
 5177 thorpej   10    0  1272K  552K wait/0     0:00  1.00%  0.05% sh
 5169 thorpej   10    0  1272K  552K wait/0     0:00  0.51%  0.05% sh
  223 root       2    0   872K 2544K select/0   0:10  0.00%  0.00% sshd
  241 root       2    0   896K 2504K select/0   0:07  0.00%  0.00% sshd
  258 root       2    0   856K 2488K select/1   0:05  0.00%  0.00% sshd
 3134 thorpej    2    0    11M   12M select/1   0:03  0.00%  0.00% make
  275 thorpej   28    0   376K 1544K CPU/1      0:01  0.00%  0.00% top
  189 root      18  -12  1072K 4384K pause/1    0:00  0.00%  0.00% ntpd
  242 thorpej   18    0  1304K  720K pause/0    0:00  0.00%  0.00% ksh
  259 thorpej   18    0  1304K  712K pause/0    0:00  0.00%  0.00% ksh
 5126 thorpej   10    0   264K 1064K wait/1     0:00  0.00%  0.00% cc
  220 root      10    0   328K  992K nanosl/1   0:00  0.00%  0.00% cron
 4840 thorpej   10    0  1272K  552K wait/1     0:00  0.00%  0.00% sh
 5172 thorpej   10    0  1272K  552K wait/1     0:00  0.00%  0.00% sh
 5125 thorpej   10    0  1272K  552K wait/1     0:00  0.00%  0.00% sh
 4836 thorpej   10    0  1272K  552K wait/1     0:00  0.00%  0.00% sh
    1 root      10    0   848K  392K wait/1     0:00  0.00%  0.00% init
  222 root       3    0    96K 1120K ttyin/1    0:00  0.00%  0.00% getty
  225 thorpej    3    0  1304K  688K ttyin/0    0:00  0.00%  0.00% ksh
  107 root       2    0  1712K 2512K select/1   0:00  0.00%  0.00% rpcbind

Yes, I know some statistics are screwed up :-)  I'll work on fixing that
later.

All the code to make this work is in NetBSD-current.  I have done all
of my testing on an AlphaServer 1200.  This is the same systype as
the AlphaServer 4100, so MP configurations of that should also Just Work.

DEC/Alpha Processor, Inc. DP264 and UP2000/UP2000+ systems should also
work.  The API CS20 should work, too, as should Compaq DS20 systems (they
are DP264 systems, as far as software is concerned).

Really, the only thing that might have a problem would be the
AlphaServer 8200/8400; they still use the PROM for console output,
and do a pretty evil thing with the level 1 page table.  That needs
to be fixed (either by doing a native serial port console driver,
or by fixing the kernel->PROM interface code) before that system
will work in an MP configuration.

There's definitely more work to do.  In particular, TLB invalidation
traffic has to be managed better.  Take a look at the frequency of
interprocessor interrupts:

frau-farbissina:thorpej 23$ vmstat -i 
interrupt                           total     rate
soft serial                           142        0
soft net                            10666        4
soft clock                          25113        9
cpu0 clock                        3063914     1204
cpu0 device                        220873       86
cpu0 ipi                          3166394     1244
cpu0 tbia ipi                        1263        0
cpu0 tbiap ipi                        987        0
cpu0 shootdown ipi                3126178     1228
cpu0 imb ipi                        60037       23
cpu0 ast ipi                            4        0
cpu0 synch fpu ipi                   1586        0
cpu0 discard fpu ipi                  277        0
cpu1 clock                        3049191     1198
cpu1 ipi                          2316222      910
cpu1 tbia ipi                        1958        0
cpu1 tbiap ipi                       1799        0
cpu1 shootdown ipi                2295031      902
cpu1 imb ipi                        35271       13
cpu1 ast ipi                            8        0
cpu1 synch fpu ipi                   2575        1
cpu1 discard fpu ipi                   30        0
kn300 irq 12                        23182        9
kn300 irq 36                           55        0
kn300 irq 40                       197498       77
isa irq 4                             142        0
Total                            17600396     6918
frau-farbissina:thorpej 24$ 

Note that TLB shootdown requests are about as frequent as the clock
interrupt :-/  Anyway, future work -- at least it runs :-)
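
To put a number on that comparison, here is a minimal sketch (not part
of the original post) that parses "vmstat -i" style output and computes
shootdown IPIs per clock interrupt for each CPU.  The function names
parse_vmstat_i and shootdown_per_tick are hypothetical helpers made up
for illustration; the sample rows are copied from the output above.

```python
import re

# Rows copied from the vmstat -i output in this message.
VMSTAT_SAMPLE = """\
interrupt                           total     rate
cpu0 clock                        3063914     1204
cpu0 shootdown ipi                3126178     1228
cpu1 clock                        3049191     1198
cpu1 shootdown ipi                2295031      902
"""


def parse_vmstat_i(text):
    """Map each interrupt name to its total count.

    Names may contain spaces, so split from the right: the last two
    fields are 'total' and 'rate'.  Lines that do not end in two
    integers (header, shell prompt) are skipped.
    """
    counts = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue
        counts[name.strip()] = int(total)
    return counts


def shootdown_per_tick(counts):
    """Return {cpu: shootdown IPIs per clock interrupt} for each CPU
    that has both a 'clock' and a 'shootdown ipi' row."""
    ratios = {}
    for name, clock_total in counts.items():
        match = re.fullmatch(r"(cpu\d+) clock", name)
        if not match or clock_total == 0:
            continue
        cpu = match.group(1)
        shootdowns = counts.get(cpu + " shootdown ipi")
        if shootdowns is not None:
            ratios[cpu] = shootdowns / clock_total
    return ratios


if __name__ == "__main__":
    for cpu, ratio in sorted(shootdown_per_tick(parse_vmstat_i(VMSTAT_SAMPLE)).items()):
        print(f"{cpu}: {ratio:.2f} shootdown IPIs per clock interrupt")
```

With the totals above this works out to roughly 1.02 shootdown IPIs per
clock interrupt on cpu0 and roughly 0.75 on cpu1.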

I'll post again later when a new binary snapshot is ready.  For now,
I'm going to go reward myself with a steak burrito.  Below is dmesg
from the system.  Share and enjoy.

-- 
        -- Jason R. Thorpe <thorpej@zembu.com>

[ using 324720 bytes of netbsd ELF symbol table ]
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 1.5U (FRAU-FARBISSINA.MP) #88: Fri Apr 20 10:42:55 PDT 2001
    thorpej@frau-farbissina.shagadelic.org:/u2/netbsd/src/sys/arch/alpha/compile/FRAU-FARBISSINA.MP
AlphaServer 1200 5/533 4MB, 531MHz, s/n NI83605791
8192 byte page size, 2 processors.
total memory = 1024 MB
(2072 KB reserved for PROM, 1021 MB used by NetBSD)
avail memory = 931 MB
using 6553 buffers containing 52424 KB of memory
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21164A-2
cpu0: Architecture extensions: 1<BWX>
cpu1 at mainbus0: ID 1, 21164A-2
mcbus0 at mainbus0
mcmem0 at mcbus0 mid 1: Memory
mcpcia0 at mcbus0 mid 5: PCI Bridge
mcpcia0: Horse Revision 3, Left Handed Saddle Revision 4, CAP Revision 2
pci0 at mcpcia0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
siop0 at pci0 dev 1 function 0: Symbios Logic 53c810 (fast scsi)
siop0: interrupting at kn300 irq 36
scsibus0 at siop0: 8 targets, 8 luns per target
isp0 at pci0 dev 2 function 0
isp0: interrupting at kn300 irq 40
scsibus1 at isp0: 16 targets, 8 luns per target
mcpcia1 at mcbus0 mid 4: PCI Bridge
mcpcia1: Horse Revision 3, Left Handed Saddle Revision 4, CAP Revision 2
pci1 at mcpcia1 bus 0
pci1: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pceb0 at pci1 dev 1 function 0: Intel 82375EB/SB PCI-EISA Bridge (PCEB) (rev. 0x15)
vga0 at pci1 dev 2 function 0: S3 Trio32/64 (rev. 0x54)
wsdisplay0 at vga0
tlp0 at pci1 dev 3 function 0: DECchip 21140A Ethernet, pass 2.2
tlp0: interrupting at kn300 irq 12
tlp0: DEC DE500-AA, Ethernet address 00:00:f8:1a:d8:f6
nsphy0 at tlp0 phy 5: DP83840 10/100 media interface, rev. 0
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
eisa0 at pceb0
tlp1 at eisa0 slot 1: DEC DE425 Ethernet, pass 2.3
tlp1: interrupting at isa irq 9
tlp1: Ethernet address 08:00:2b:93:c7:a7
tlp1: 10baseT, 10baseT-FDX, 10base5, manual
isa0 at pceb0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0
lpt0 at isa0 port 0x3bc-0x3bf irq 7
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
isabeep0 at pcppi0
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
mcbus0 mid 2: CPU 4MB BCache
stray kn300 irq 40
scsibus0: waiting 2 seconds for devices to settle...
cd0 at scsibus0 target 5 lun 0: <DEC, RRD46   (C) DEC, 1337> SCSI2 5/cdrom removable
siop0: target 5 now synchronous at 10.0Mhz, offset 8
scsibus1: waiting 2 seconds for devices to settle...
sd0 at scsibus1 target 0 lun 0: <DEC, RZ2DA-LA (C) DEC, N1H1> SCSI2 0/direct fixed
sd0: 8678 MB, 5273 cyl, 20 head, 168 sec, 512 bytes/sect x 17773524 sectors
sd1 at scsibus1 target 1 lun 0: <DEC, RZ2DA-LA (C) DEC, N1H1> SCSI2 0/direct fixed
sd1: 8678 MB, 5273 cyl, 20 head, 168 sec, 512 bytes/sect x 17773524 sectors
sd2 at scsibus1 target 2 lun 0: <DEC, RZ2DA-LA (C) DEC, N1H1> SCSI2 0/direct fixed
sd2: 8678 MB, 5273 cyl, 20 head, 168 sec, 512 bytes/sect x 17773524 sectors
sd3 at scsibus1 target 3 lun 0: <DEC, RZ2DA-LA (C) DEC, N1H1> SCSI2 0/direct fixed
sd3: 8678 MB, 5273 cyl, 20 head, 168 sec, 512 bytes/sect x 17773524 sectors
sd4 at scsibus1 target 4 lun 0: <DEC, RZ1CB-CA (C) DEC, LYJ0> SCSI2 0/direct fixed
sd4: 4091 MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
sd5 at scsibus1 target 5 lun 0: <DEC, RZ1CB-CA (C) DEC, LYJ0> SCSI2 0/direct fixed
sd5: 4091 MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
sd6 at scsibus1 target 6 lun 0: <DEC, RZ1CB-CA (C) DEC, LYJ0> SCSI2 0/direct fixed
sd6: 4091 MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
root on sd0a dumps on sd0b
root file system type: ffs
cpu1: CPU 1 running
RAIDFRAME: protectedSectors is 64
raid0: Component /dev/sd1a being configured at row: 0 col: 0
         Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
         Version: 2 Serial Number: 22473 Mod Counter: 1666
         Clean: Yes Status: 0
raid0: Component /dev/sd2a being configured at row: 0 col: 1
         Row: 0 Column: 1 Num Rows: 1 Num Columns: 3
         Version: 2 Serial Number: 22473 Mod Counter: 1666
         Clean: Yes Status: 0
raid0: Component /dev/sd3a being configured at row: 0 col: 2
         Row: 0 Column: 2 Num Rows: 1 Num Columns: 3
         Version: 2 Serial Number: 22473 Mod Counter: 1666
         Clean: Yes Status: 0
RAIDFRAME: Configure (RAID Level 5): total number of sectors is 35540160 (17353 MB)
RAIDFRAME(RAID Level 5): Using 20 floating recon bufs with head sep limit 10