Subject: Re: Bootability eludes me once again
To: None <port-i386@netbsd.org>
From: Anne Bennett <anne@alcor.concordia.ca>
List: port-i386
Date: 05/01/2002 23:04:53
>> Hum, there's a problem here. If you want to boot from scsi, I guess
>> the first SCSI disk should appear as C: (80h) not D:.
>> It appears as D: because the BIOS already assigned C: to the IDE drive.

Definitely the hardware's start-up messages show the three SCSI disks
as 81h, 82h, 83h, which suggests that the IDE disk is 80h.  Since you
folks tell me that the boot code accepts the numbering set by the BIOS,
then we would expect to see what the floppy's boot code reports (with
"ls"), i.e. that the disks are ordered "wd0, sd0, sd1, sd2".  How it
is that the boot code loaded from sd1 reports "sd0, sd1, sd2, wd0"
is rather a mystery.

>> Reading through the rest of your message, I guess the problem here
>> is that the BIOS gets the disk numbers wrong: when set to boot from
>> SCSI it will load the boot sector of D:, but then claim to the boot sector
>> that it was loaded from C:
>> If your IDE disk isn't used for anything else but storing data in NetBSD
>> you can disable C: in BIOS (just claim there is no C: drive, don't disable 
>> the IDE controller). Then the SCSI BIOS should assign C: to the first SCSI
>> disk and the boot should work.
> 
> This could be a problem - it would require that the BIOS is being
> inconsistent with its numbering scheme (which is clearly true,
> since the 'boot from floppy' and 'boot from disk' options seem to
> give the disks different numbers).

In that case, they would be reported differently during the hardware's
start-up messages, right?  I'll be particularly vigilant about checking
for any changes in those messages.

> OTOH, pressing F4 found the bootselect code again,
> so the bios must be reading the correct disk (0x80 = scsi disk 0).
> Possibly it is passing in a different disk id to the mbr code,
> but that seems unlikely.....
> 
> My bets are still on the scsi bios using a different chs-lba
> translation.

Urg.  I just did "fdisk sd0" again.  Yesterday, it reported:

|  NetBSD disklabel disk geometry:
|  cylinders: 4826 heads: 4 sectors/track: 107 (428 sectors/cylinder)
|  
|  BIOS disk geometry:
|  cylinders: 1023 heads: 255 sectors/track: 63 (16065 sectors/cylinder)
|  
|  Partition table:
|  0: <UNUSED>
|  1: <UNUSED>
|  2: <UNUSED>
|  3: sysid 169 (NetBSD)
|      start 64, size 64 (0 MB), flag 0x80
|          beg: cylinder    0, head   1, sector  2
|          end: cylinder    0, head   2, sector  2

Just now, it reported:

|  NetBSD disklabel disk geometry:
|  cylinders: 4826 heads: 4 sectors/track: 107 (428 sectors/cylinder)
|  
|  BIOS disk geometry:
|  cylinders: 1010 heads: 64 sectors/track: 32 (2048 sectors/cylinder)
|  
|  Partition table:
|  0: <UNUSED>
|  1: <UNUSED>
|  2: <UNUSED>
|  3: sysid 169 (NetBSD)
|      start 64, size 64 (0 MB), flag 0x80
|          beg: cylinder    0, head   1, sector  2
|          end: cylinder    0, head   2, sector  2

Note the difference in the BIOS geometry.

Now, I'm not going to pretend that a cut-and-paste error on my part
is impossible.  However, I would have cut and pasted these results
as a block, and none of my other disks have 4826 "NetBSD cylinders".
The NetBSD geometry suggests 4826*4*107 = 2065528 sectors.  In the case
of yesterday's BIOS geometry results, there are 1023*255*63 = 16434495
sectors (no match), whereas in the case reported just now, there are
1010*64*32 = 2068480 sectors, which is reasonably close.  Both cases
show sector 64 at CHS 0/1/2, which is correct if there are 63 sectors
per track.  (I think so, anyway, though I would have thought that track 0
contained sectors 0-62, with 63 as the first, #0, sector of track 1, i.e.
cyl. 0, head 1, and 64 as sector 1, not sector 2, of that track.)
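
To double-check the arithmetic above, here is a minimal sketch in Python.
The function names are made up for illustration, and it assumes the usual
CHS convention in which cylinders and heads count from 0 but sectors
within a track count from 1 (which would explain "sector 2" rather than
"sector 1" for absolute sector 64):

```python
def total_sectors(cylinders, heads, spt):
    """Number of sectors implied by a cylinders/heads/sectors-per-track geometry."""
    return cylinders * heads * spt


def lba_to_chs(lba, heads, spt):
    """Translate an absolute sector number to (cylinder, head, sector).

    Assumes cylinders and heads are 0-based and sectors are 1-based.
    """
    cylinder, rem = divmod(lba, heads * spt)
    head, sector0 = divmod(rem, spt)
    return cylinder, head, sector0 + 1


# Geometries taken from the two fdisk reports above.
print(total_sectors(4826, 4, 107))   # NetBSD disklabel geometry
print(total_sectors(1023, 255, 63))  # BIOS geometry reported yesterday
print(total_sectors(1010, 64, 32))   # BIOS geometry reported today

# Start and end of the partition (start 64, size 64) at 63 sectors/track.
print(lba_to_chs(64, 255, 63))
print(lba_to_chs(64 + 64 - 1, 255, 63))
```

Under that convention, sectors 64 and 127 come out as 0/1/2 and 0/2/2 at
63 sectors/track, which is what the partition table shows.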

Definitely, what was reported just now (and I issued the command
repeatedly because I don't believe what I'm seeing) cannot be correct:
at 32 sectors/track, sector 64 has to be on head 2, not head 1.
Interestingly, "fdisk sd0a" also shows 32 sectors/track today, where
yesterday it showed 63 sectors/track.
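
As a rough check of which geometry the stored CHS values agree with, the
following Python sketch recomputes the begin/end CHS of the partition
under both reported BIOS geometries.  The helper names are hypothetical,
and it again assumes 0-based cylinders and heads with 1-based sectors:

```python
def lba_to_chs(lba, heads, spt):
    """Absolute sector -> (cylinder, head, sector), sectors counted from 1."""
    cylinder, rem = divmod(lba, heads * spt)
    head, sector0 = divmod(rem, spt)
    return cylinder, head, sector0 + 1


def entry_matches(start, size, beg, end, heads, spt):
    """True if the stored beg/end CHS agree with start/size under this geometry."""
    last = start + size - 1
    return (lba_to_chs(start, heads, spt) == beg
            and lba_to_chs(last, heads, spt) == end)


# Partition 3 as reported by fdisk on both days.
START, SIZE = 64, 64
BEG, END = (0, 1, 2), (0, 2, 2)

# Yesterday's BIOS geometry: 255 heads, 63 sectors/track.
print(entry_matches(START, SIZE, BEG, END, heads=255, spt=63))
# Today's BIOS geometry: 64 heads, 32 sectors/track.
print(entry_matches(START, SIZE, BEG, END, heads=64, spt=32))
# Where sector 64 would land under today's geometry.
print(lba_to_chs(START, 64, 32))
```

The stored values are consistent only with the 63 sectors/track geometry;
under 32 sectors/track, sector 64 would be at head 2, sector 1.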

I really don't think I dreamed this.  Assuming I'm not totally off my
rocker (no comments from the peanut gallery, please!), "fdisk" is
reporting a different number for sectors/track today than it did
yesterday.  I definitely did not try to mess with the values myself.

Something fishy is going on.  Ideas welcome.  I probably won't be
able to try anything new on this until the week-end.  In the meantime,
cue the Twilight Zone theme...

> I can possibly build a copy of the mbr_bootsel code with
> specific diagnostics and/or specific behaviour.
> (But there isn't much space to play with!)

I must admit that I wonder why CHS are used instead of block number from
the start of the disk -- I thought that disks were addressed by block
number anyway?

I suppose that if the CHS translation is the problem, then a workaround
would be to make sure that the first sector of the NetBSD partition is
on track zero of the disk (using the smaller of the BIOS and NetBSD
sectors/track number).  If the MBR is on sector zero only, then presumably
I can put my NetBSD partition as early as sector one, which leads me
to wonder why the default installation of NetBSD these days seems to
leave many sectors unused at the start of the disk.
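
To illustrate the track-zero idea, here is a small Python sketch (names
made up for illustration; 0-based cylinders and heads, 1-based sectors
assumed) that lists the absolute sectors whose CHS address is the same
under all three geometries seen so far.  It only shows that the CHS
arithmetic is geometry-independent there; whether the boot code would
actually be happy with such a partition start is an assumption I have
not tested:

```python
def lba_to_chs(lba, heads, spt):
    """Absolute sector -> (cylinder, head, sector), sectors counted from 1."""
    cylinder, rem = divmod(lba, heads * spt)
    head, sector0 = divmod(rem, spt)
    return cylinder, head, sector0 + 1


# (heads, sectors/track) for each geometry reported by fdisk.
GEOMETRIES = {
    "NetBSD disklabel": (4, 107),
    "BIOS, yesterday": (255, 63),
    "BIOS, today": (64, 32),
}


def geometry_independent(lba):
    """True if every geometry maps this sector to the same CHS triple."""
    return len({lba_to_chs(lba, h, s) for h, s in GEOMETRIES.values()}) == 1


min_spt = min(spt for _, spt in GEOMETRIES.values())
safe = [lba for lba in range(128) if geometry_independent(lba)]

print(min_spt)             # smallest sectors/track value among the geometries
print(safe[0], safe[-1])   # first and last geometry-independent sector
```

Among the first 128 sectors, only those below the smallest sectors/track
value (32 here) map to the same CHS address in every geometry.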


Anne.
-- 
Ms. Anne Bennett, Senior Analyst, IITS, Concordia University, Montreal H3G 1M8
anne@alcor.concordia.ca                                        +1 514 848-7606