Subject: root on cardbus devices, a fix...
To: None <tech-kern@netbsd.org>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: tech-kern
Date: 01/23/2000 11:09:32
PR kern/9247 describes a problem where cardbus devices cannot be used as
the root device because the cardbus code uses threads to do card discovery,
and those threads don't run until after the root file system has been mounted.

Well, with some clever code rearranging and use of 2 semaphores, I have
fixed this problem.

The following are boot messages from my Dell laptop with a cardbus Ethernet
card:

NetBSD 1.4Q (DR-EVIL) #272: Sun Jan 23 10:27:41 PST 2000
    thorpej@dr-evil:/u1/netbsd/src/sys/arch/i386/compile/DR-EVIL
cpu0: family 6 model 6 step a
cpu0: Intel Pentium II (Celeron) (686-class)
total memory = 32320 KB
avail memory = 27196 KB
using 429 buffers containing 1716 KB of memory
BIOS32 rev. 0 found at 0xfd7f0
PCI BIOS rev. 2.1 found at 0xfd9f3
pcibios: config mechanism [1][x], special cycles [x][x], last bus 1
PCI IRQ Routing Table rev. 1.0 found at 0xfdf80, size 96 bytes (4 entries)
PCI Interrupt Router at 000:07:0 (Intel 82371FB PCI-to-ISA Bridge (PIIX))
--------------------------------------------
  device vendor product pin PIRQ   IRQ stage
--------------------------------------------
000:04:0 0x104c 0xac17  A   0x00   11  0
000:04:1 0x104c 0xac17  A   0x00   11  0
000:07:2 0x8086 0x7112  D   0x03   10  0
--------------------------------------------
PCI bridge 0: primary 0, secondary 1, subordinate 1
PCI bridge 1: primary 0, secondary 2, subordinate 2
PCI bridge 2: primary 0, secondary 3, subordinate 3
PCI bus #3 is the last bus
mainbus0 (root)
pnpbios0 at mainbus0: 18 nodes, max len 106
com0 at pnpbios0 index 12 (PNP0501)
com0: io 3f8-3ff, irq 4
com0: ns16550a, working fifo
lpt1 at pnpbios0 index 17 (PNP0400)
lpt1: io 378-37f, irq 7
wss0 at pnpbios0 index 18 (NMX2210)
wss0: io 220-22f 530-537 388-38f 320-321, irq 5, dma 0 1
wss0: CS4231 or AD1845
audio0 at wss0: full duplex, mmap
opl0 at wss0: model OPL3
midi0 at opl0: WSS Yamaha OPL3
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: Intel 82443BX Host Bridge/Controller (rev. 0x03)
ppb0 at pci0 dev 1 function 0: Intel 82443BX AGP Interface (rev. 0x03)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga0 at pci1 dev 0 function 0: Neomagic MagicMedia 256AV VGA (rev. 0x20)
wsdisplay0 at vga0: console (80x25, vt100 emulation)
Neomagic MagicMedia 256AV Audio (audio multimedia, revision 0x20) at pci1 dev 0 function 1 not configured
cbb0 at pci0 dev 4 function 0 (TI1220), chipflags 3
cbb0: can't map socket base address 0x40000000
pci_io_find: expected type i/o, found mem
cbb0: can't map socket base address 0xf02f530c: io mode
cbb1 at pci0 dev 4 function 1 (TI1220), chipflags 3
pcib0 at pci0 dev 7 function 0
pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
pciide0 at pci0 dev 7 function 1: Intel 82371AB IDE controller (PIIX4)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
wd0 at pciide0 channel 0 drive 0: <FUJITSU MHE2064AT>
wd0: drive supports 16-sector pio transfers, lba addressing
wd0: 6194MB, 13424 cyl, 15 head, 63 sec, 512 bytes/sect x 12685680 sectors
wd0: 32-bits data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (using DMA data transfers)
pciide0: secondary channel wired to compatibility mode
pciide0: disabling secondary channel (no drives)
uhci0 at pci0 dev 7 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
uhci0: interrupting at irq 10
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 7 function 3 not configured
cbb0: interrupting at irq 11
cbb0: cacheline 0x0 lattimer 0x20
cbb0: bhlc 0x821000 lscp 0x20020200
pccbb_pcmcia_write t=1 h=fafe2000 r=2053 v=0
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 2 device 0 cacheline 0x0, lattimer 0x20
pcmcia0 at cardslot0
cbb1: interrupting at irq 11
cbb1: cacheline 0x0 lattimer 0x20
cbb1: bhlc 0x821000 lscp 0x20030300
pccbb_pcmcia_write t=1 h=fafdb000 r=2053 v=0
cardslot1 at cbb1 slot 1 flags 0
cardbus1 at cardslot1: bus 3 device 0 cacheline 0x0, lattimer 0x20
pcmcia1 at cardslot1
isa0 at pcib0
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 (mux 1 ignored for console): console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
midi1 at pcppi0: PC speaker
spkr0 at pcppi0
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
fd1 at fdc0 drive 1: density unknown
apm0 at mainbus0: Power Management spec V1.2
apm0: battery life expectancy: 85%
apm0: A/C state: off
apm0: battery charge state: high
apm0: estimated 5h 42m
biomask ef4d netmask ef4d ttymask ffcf
IPsec: Initialized Security Association Processing.
ex0 at cardbus0 dev 0 function 0: 3Com 3c575-TX Ethernet
ex0: interrupting at 11
ex0: MAC address 00:60:08:03:6e:ad
tqphy0 at ex0 phy 0: 78Q2120 10/100 media interface, rev. 3
tqphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ex0: supplying EUI64: 00:60:08:ff:fe:03:6e:ad
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
wsmux1: connecting to wsdisplay0
ex0: starting DAD for fe80:0009::0260:08ff:fe03:6ead
ex0: DAD complete for fe80:0009::0260:08ff:fe03:6ead - no duplicates found

Note that the thread which attaches "ex0" ran before the root file system
was mounted.

It works by adding 2 semaphores: config_pending and start_init_exec.  main()
forks initproc, which waits for start_init_exec to be released before it
actually execs init(8).  This preserves "init is process 1" semantics.  Then
main() forks off the other kthreads that device drivers have requested.

By this point, device drivers have incremented the config_pending semaphore
as needed, and main() now waits for that semaphore to be released before
attempting to mount the root file system.  As each driver's discovery thread
completes its first pass, it decrements the config_pending semaphore.  When it
reaches 0, main() is awakened, and it mounts the root file system.

Now that root has been mounted, the process start times are fixed up (as
time may have been reset by the root file system timestamp), and the CWD
info for the kthreads and initproc is fixed up to properly reference the
rootvnode.

At this point, main() forks off the pagedaemon (uvm_swap_init() needs the
root file system to be mounted, and it must run before the pagedaemon starts),
reaper, and ioflush, then releases the start_init_exec semaphore before
entering the swapper loop.

Now initproc will exec init, and away we go.
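
For anyone who wants to see the ordering spelled out, here is a rough userland
sketch of the sequence above.  It is not the actual patch: it assumes POSIX
threads with a mutex and condition variable standing in for the kernel
primitives, and pending_incr(), discovery_thread(), init_thread(),
simulate_boot() and MAX_DEVICES are hypothetical names made up for the
illustration.  Only config_pending and start_init_exec come from the
description above.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_DEVICES 16          /* arbitrary cap for the sketch */

static pthread_mutex_t boot_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t boot_cv = PTHREAD_COND_INITIALIZER;
static int config_pending;      /* discovery passes still outstanding */
static int start_init_exec;     /* nonzero once "init" may exec */
static int root_mounted;        /* stand-in for mounting the root fs */
static int attached_before_root;
static int init_saw_root;

/* A driver calls this at attach time when it defers discovery to a thread. */
static void pending_incr(void)
{
    pthread_mutex_lock(&boot_lock);
    config_pending++;
    pthread_mutex_unlock(&boot_lock);
}

/* First discovery pass: "attach" the device, then drop config_pending. */
static void *discovery_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&boot_lock);
    if (!root_mounted)
        attached_before_root++;
    assert(config_pending > 0);
    if (--config_pending == 0)
        pthread_cond_broadcast(&boot_cv);   /* wake the boot thread */
    pthread_mutex_unlock(&boot_lock);
    return NULL;
}

/* Stand-in for initproc: created early, but blocked until released. */
static void *init_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&boot_lock);
    while (!start_init_exec)
        pthread_cond_wait(&boot_cv, &boot_lock);
    init_saw_root = root_mounted;           /* here it would exec init(8) */
    pthread_mutex_unlock(&boot_lock);
    return NULL;
}

/*
 * Returns 1 if every device attached before root was mounted and "init"
 * ran only after the mount, 0 if the ordering was violated, and -1 on bad
 * input or if the init thread could not be created.
 */
int simulate_boot(int ndevices)
{
    pthread_t init_tid, tids[MAX_DEVICES];
    int i, started = 0;

    if (ndevices < 0 || ndevices > MAX_DEVICES)
        return -1;
    config_pending = start_init_exec = root_mounted = 0;
    attached_before_root = init_saw_root = 0;

    for (i = 0; i < ndevices; i++)          /* autoconfiguration phase */
        pending_incr();

    if (pthread_create(&init_tid, NULL, init_thread, NULL) != 0)
        return -1;                          /* "fork initproc" failed */
    for (i = 0; i < ndevices; i++) {        /* start the driver threads */
        if (pthread_create(&tids[started], NULL, discovery_thread, NULL) == 0)
            started++;
        else
            discovery_thread(NULL);         /* fall back to attaching inline */
    }

    pthread_mutex_lock(&boot_lock);
    while (config_pending > 0)              /* wait for discovery to finish */
        pthread_cond_wait(&boot_cv, &boot_lock);
    root_mounted = 1;                       /* "mount root" */
    /* The real sequence starts pagedaemon, reaper and ioflush here. */
    start_init_exec = 1;                    /* release initproc */
    pthread_cond_broadcast(&boot_cv);
    pthread_mutex_unlock(&boot_lock);

    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_join(init_tid, NULL);

    return attached_before_root == ndevices && init_saw_root;
}
```

Calling simulate_boot(3) from a small test program (built with -pthread)
should return 1, meaning all three "devices" attached before the root mount
and "init" only proceeded afterwards.  The sketch uses a single condition
variable for both waits to keep it short; each waiter rechecks its own
predicate, so a wakeup meant for the other one is harmless.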

It's all very simple really :-)

BTW, this will also be used, eventually, to allow RAIDframe to be the
root device (RAIDframe requires threads).

Anyhow, if people are okay with this solution, I'll commit it soonish.  Note
I have only changed the cardbus code to use it, but changing e.g. USB should
be a trivial matter (so you can mount a USB Zip drive as root, or NFS root over
USB Ethernet!  :-)

        -- Jason R. Thorpe <thorpej@nas.nasa.gov>