Subject: Re: devfs (was Share common code/data across ports?)
To: None <mrg@eterna.com.au>
From: Greg Earle <earle@isolar.Tujunga.CA.US>
List: current-users
Date: 01/14/1997 10:29:59
>     * ever tried to boot a solaris disk on a machine that it wasn't installed
>     * on, and wasn't configured basically identically ?  ever tried booting
>     * from a disk that previously didn't exist ?
>    
>    Rubbish:  boot -rs[v] and it should work just fine.
>    
>    You might need to supply the -b flag as well and perform all the startup
>    to single-user manually.
> 
> oh really?
> 
> here's what happened to me:
> 
> i installed on a disk when it was on controller 0, scsi id 3.  i
> actually wanted this to be controller 1, scsi id 1.  so, i moved
> it.  when i booted, it failed to fsck, couldn't mount root or
> user, and halted.  ok, i thought, `boot -rvsb' should fix this.
> uh uh, said solaris!
> 
> the problem was this:
> 
> there were no devices for c1t1.  ie, i could not mount c1t1.  as
> i couldn't mount the root filesystem, i couldn't _modify_ it.
> 
> for some reason i never determined, even booting from the cdrom,
> doing a chroot, devlinks, disks, etc. etc., wasn't enough to get
> the devices on that disks /devices and /dev.  i ended up having
> to reinstall the OS from scratch, with the disk in it's correct location.

This probably means your /etc/path_to_inst got biffed.

I just had the same thing happen to me recently; an Ultra 2 got stuck in a
frozen state and when I rebooted, it said "can't mount /usr" or somesuch.
Given that it was on the boot disk, I thought "This is insane, it's booted
the kernel from the same damn disk, I guess the top-level root inode of /usr
must have gotten mangled."  I booted from CD-ROM and fsck'ed the real /usr
with no problem, and mounted it just fine.  It eventually turned out that the
/etc/path_to_inst on the real root partition no longer matched reality, and
that killed everything.  I had to jump through some hoops,
running "drvconfig" and "devconfig" from the CD-ROM using the "-r" option
to point them to the mount point for the real root partition; even this didn't
work, because one of them complained that it couldn't frob "/etc/path_to_inst".
(i.e., using "-r" didn't make the program realize it should prepend that
 directory to "/etc/path_to_inst".)  I had to literally copy the "drvconfig"
binary to "/tmp" and adb it to change the "/etc/path_to_inst" string to
instead be "/tmp/path_to_inst", so I could generate a new "path_to_inst" file
that I could then copy to the real root partition.  Once I did that, I was
able to reboot the system and get a handle on it again.  Utter madness.

(Apologies to christos, cgd, pk and alanp for repeating the same tale I
 regaled them with in the hotel lounge.  Funny that it should bite mrg.)

Another time (same Ultra 2) I needed to add an FSBE/S SCSI card, and it was
too big to fit in the SBus slot I wanted (so much for standardized SBus cards).
I moved the Crescendo CDDI card and put the FSBE/S in the CDDI slot.  I then
rebooted with "-srv" and the machine hung trying to do an NFS mount.  Gee,
why?  Because the machine was only hooked up to the CDDI, and when I moved
the card, changing the slot assignment made the system think I had a new card
in there.  So it recreated the device as "cddi1" instead of "cddi0", and of
course the system depended on "cddi0" (/etc/hostname.cddi0, anyone?).  Duh ...
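
(A toy model of why that happens, assuming instance numbers are remembered
 per physical device path and a never-before-seen path gets the lowest free
 number for its driver.  The device paths below are invented for
 illustration, and the real allocation policy may well differ.)

```python
def assign_instance(table, phys_path, driver):
    """Return the device name for phys_path, recording new paths in table."""
    # A path we have seen before keeps the instance number it was given.
    if phys_path in table:
        drv, inst = table[phys_path]
        return "%s%d" % (drv, inst)
    # A new path gets the lowest instance number not already taken by
    # this driver (assumed policy).  Old entries are never forgotten.
    used = {inst for drv, inst in table.values() if drv == driver}
    inst = 0
    while inst in used:
        inst += 1
    table[phys_path] = (driver, inst)
    return "%s%d" % (driver, inst)


table = {}  # persistent state, playing the role of path_to_inst

# Hypothetical slot paths: same card, before and after the move.
first = assign_instance(table, "/sbus@1f,0/cddi@1,0", "cddi")  # "cddi0"
moved = assign_instance(table, "/sbus@1f,0/cddi@2,0", "cddi")  # "cddi1"
```

The stale entry for the old slot still owns instance 0, so the same card in
a new slot comes up as instance 1, and anything keyed on the old name breaks.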

My point (and the NetBSD relevance): I don't like the idea of a /dev that's
created on the fly.  At least, not one that's created like Solaris' is.
It keeps state.  And if that state becomes inconsistent, it's a nightmare
trying to undo that state.  (Ponder the fact that mrg had to reinstall the
system.  This is mrg we're talking about here!!!)  Doing auto-configuration on
the fly seems like a nifty idea, and it works fine when it's working, but
heaven help you if things break and you have to go underneath the hood.
I vote "No".

(All those who remember the Sun386i and SNAP, raise your hands)

	- Greg