NetBSD-Users archive


Re: disk geometry (i386/amd64)



On 9/9/2018 1:52 PM, Michael van Elst wrote:
On Sun, Sep 09, 2018 at 12:08:14PM -0700, Don NetBSD wrote:
Said another way, are these "in-kernel" values (which no longer reflect
the physical medium) ever reported in other system calls/ioctls/etc.
INSTEAD of the "real" values?

What is 'real' ? Start with the assumption that there is no way to know
the real data and that the disklabel is the method to record such
information. 'real' is then what you wrote into the disklabel.

The disk has a real size that can be queried from the actual device.
Admittedly, things like the actual "geometry" are a moot point.  But,
the size is something that the system and the drive have to agree on
(whether you care about overprovisioned sectors isn't important at
this point)

Obviously, if drvctl(8) can report the "real" values, then they must be
preserved somewhere besides the in-kernel disklabel.

Modern drives can be queried about the disk geometry. The driver does
this and drvctl can be used to query the values collected by the driver.

The in-kernel disklabel is either generated ("default label") or read
from the on-disk label sector and sanitized a bit.

With a "foreign" drive, you can't count on the media to contain any meaningful
data in any particular places.  E.g., imagine "something" has treated the drive
as N "blocks of memory" and used them however IT deemed fit -- with no concern
for NetBSD (or *any* OS).  I.e., there was no concept of a "disk label", MBR,
etc. prior to the disk coming to the NetBSD machine.

I wanted assurance that opening the 'd' partition gives me access to the
entire medium, regardless of what MIGHT have been encountered in the
sectors of the medium that NetBSD examines in search of a label.

Further, I wanted to know that I could query the OS for details as to the
size of the medium (sector size and number of sectors).  Again, without relying
on the media to have been preinitialized *or* requiring it to be "initialized"
prior to my access.  (I *won't* be writing a "disklabel" onto the medium, but
will otherwise be altering much/all of its contents.)

Erasing an on-disk label is done with 'disklabel -D'. The kernel will then
use a generated default label.

Writing a few sectors of garbage to /dev/rsd#d (seeking to start of file)
won't overwrite the label.

The label sector is normally write-protected and only accessible through
special ioctls.

Yes, DIOCWLABEL.  My plan was to DIOCGDINFO to get the *information* from the
(fictitious) disk label and then DIOCWLABEL to enable access to the entire
medium.  Then, open() the character/raw device and seek/write to my heart's
content.

But, this counts on the DIOCGDINFO ioctl giving me unadulterated information
regardless of the previous contents of the disk -- or, any disk that may have
previously been probed at that device (i.e., a drive that has since been
ejected).
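
The DIOCGDINFO/DIOCWLABEL plan described above might look roughly like the
following. This is only a sketch under stated assumptions: the helper names
label_bytes() and unlock_whole_disk() are made up for illustration, and
whether the label returned by DIOCGDINFO reflects the drive or stale on-disk
data is exactly the open question here. Note that d_secsize and d_secperunit
are 32-bit fields, so the product has to be widened to 64 bits, and a label
cannot describe more than 2^32 sectors. The NetBSD-specific part is guarded
so the arithmetic helper compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen before multiplying, both inputs are 32-bit. */
uint64_t
label_bytes(uint32_t secsize, uint32_t secperunit)
{
	return (uint64_t)secsize * secperunit;
}

#ifdef __NetBSD__
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/disklabel.h>
#include <sys/dkio.h>

/*
 * Hypothetical helper: fd is an already-open raw whole-disk device,
 * e.g. /dev/rsd1d opened O_RDWR.  Returns 0 on success, -1 on error.
 */
int
unlock_whole_disk(int fd, uint64_t *bytes)
{
	struct disklabel dl;
	int writable = 1;

	assert(bytes != NULL);

	/* In-kernel label: read from the medium, or a generated default. */
	if (ioctl(fd, DIOCGDINFO, &dl) == -1)
		return -1;
	*bytes = label_bytes(dl.d_secsize, dl.d_secperunit);

	/* Lift the write protection on the label sector. */
	return ioctl(fd, DIOCWLABEL, &writable);
}
#endif /* __NetBSD__ */
```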

Verify this by copying one disk to another:
     dd if=/dev/rsd0d of=/dev/rsd1d bs=1024k
and verifying the label of the destination disk isn't altered (?)

If there is no label on the disk, then you will get an artificial
'default label'. If there is a label on the disk, it will be read
and used by the first opener of the device, so disklabel would indeed
show the altered values.

So, given that the previous contents of the disk are indeterminate, I
can't count on the information returned from the *medium* (hence I
need access to the information returned from the *drive* -- the parameters
that the drive's onboard controller uses and not "data" that happens to
be stored in some particular place on a platter)

Bottom line:  I'm trying to expose the "native" (avoiding the use of the
word "raw" to minimize the association with the "character device") disk
interface to my code so I can put what I want

The "native" disk interface is what you use and has its limits.
Fortunately this is only relevant to a few low-level tools and
several additional interfaces were added to address the shortcomings.

To access the raw disk nowadays you would:

- use opendisk() to get a file descriptor.
- use ioctl(...,DIOCGSECTORSIZE,...) to get the size of a disk sector in bytes.
- use ioctl(...,DIOCGMEDIASIZE,...) to get the size of the disk in bytes
  (divide by the sector size to get the number of disk sectors).

A quick grep of the sources (7.1/amd64) doesn't turn up any hits for these
ioctls.  I had planned on DIOCGDINFO -- with all the caveats mentioned
above.  Perhaps DIOCGDISKINFO would be a better choice?

- use lseek/read or pread to read sectors
- use lseek/write or pwrite to write sectors
- use ioctl(...,DIOCCACHESYNC,...) to force the disk to write cached data.
- close() the descriptor when done.
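
The steps in this list can be sketched in C roughly as follows. This is a
minimal, unverified sketch: it assumes a NetBSD release whose <sys/dkio.h>
provides DIOCGSECTORSIZE and DIOCGMEDIASIZE, that DIOCGMEDIASIZE reports the
size in bytes (so the sector count is derived by division), and that
DIOCCACHESYNC wants a descriptor opened for writing. The helper names
sector_count() and read_first_sector() are made up for illustration. The
NetBSD-only part is guarded so the arithmetic helper compiles elsewhere;
on NetBSD, link with -lutil for opendisk(3).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: whole sectors in a medium of media_bytes bytes. */
uint64_t
sector_count(off_t media_bytes, unsigned int sector_size)
{
	if (media_bytes <= 0 || sector_size == 0)
		return 0;
	return (uint64_t)media_bytes / sector_size;
}

#ifdef __NetBSD__
#include <sys/ioctl.h>
#include <sys/dkio.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>
#include <util.h>	/* opendisk(3) */

/*
 * Hypothetical helper: read sector 0 of a disk such as "sd1" into a
 * malloc'd buffer (caller frees).  Returns NULL on any failure.
 */
void *
read_first_sector(const char *disk, unsigned int *secsize, uint64_t *nsectors)
{
	char path[PATH_MAX];
	off_t bytes;
	void *buf = NULL;
	int force = 0;
	int fd;

	assert(secsize != NULL && nsectors != NULL);

	/* iscooked == 0 selects the raw (character) device. */
	fd = opendisk(disk, O_RDWR, path, sizeof(path), 0);
	if (fd == -1)
		return NULL;

	if (ioctl(fd, DIOCGSECTORSIZE, secsize) == -1 ||
	    ioctl(fd, DIOCGMEDIASIZE, &bytes) == -1)
		goto out;
	*nsectors = sector_count(bytes, *secsize);

	/* Raw-device I/O must be done in whole, sector-aligned units. */
	if ((buf = malloc(*secsize)) != NULL &&
	    pread(fd, buf, *secsize, 0) != (ssize_t)*secsize) {
		free(buf);
		buf = NULL;
	}

	/* After any pwrite(), ask the drive to flush its write cache. */
	(void)ioctl(fd, DIOCCACHESYNC, &force);
out:
	close(fd);
	return buf;
}
#endif /* __NetBSD__ */
```

Writing works the same way with pwrite() at an offset of (sector number *
sector size), followed by the DIOCCACHESYNC call before close().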

But if you do this and change how the OS sees the disk (i.e. change
partitioning or similar), you'd just create confusion.

There's no confusion as no one/nothing looks at the disk besides my software.

Typical usage:
- insert N drives of varying specifications into appliance
- indicate readiness of each drive as it is inserted (keystroke/mouseclick)
- N instances of the software characterize each drive, query the database
  regarding how each *individual* drive should be initialized, etc.
- software initializes each drive, taking varying amounts of time based
  on the initialization requirements, amount of data to be moved, etc.
  (i.e., some drives will be "done" before others)
- as each drive is "finished", update database, print label to affix to
  drive plus routing ticket
- spin down drive and inform operator of each drive becoming finished
  (along with any notification for future processing, reject, etc.)
- when notified, operator ejects drive, affixes label and routing tag
- as a slot is now vacant, operator inserts another drive and "starts"
  the process for that drive (mouse click/keystroke)

Neither NetBSD nor any other "app" ever looks at the drive.  We're
just using NetBSD -- and COTS hardware -- as an expedient.

We'd like to build several different "appliances" using COTS hardware
as an expedient.  Admittedly, this approach doesn't scale very well
and is power hungry.  But, it can be up and running long before we
throw together some hardware fixtures to do this more economically.

as I don't expect NetBSD to "use" the disk in any way).  And, trying to
figure out what contractual guarantees I can get from the OS to stay out
of my way in doing so.

You apparently want to bypass the OS. Don't do that.

See above.

OK.  Again, I can look through the source to see what the actual mechanism
is (to reproduce it in my code).

Hopefully not. The actual mechanism might not even be close to what you
can assume as 'guaranteed'.

Do you happen to know if SATA/SAS are *inherently* hotpluggable?  Or, do
they need *hardware* support on the motherboard/backplane to do this?

Hotplug is an optional feature. For one you want special connectors that
connect ground first and disconnect ground last. Normal SATA won't have
that but eSATA and also special backplanes have that. For two, a SATA
controller with hotplug can detect when a disk is unplugged and plugged
similar to USB and signal the driver about it. But that feature isn't
supported by the NetBSD driver yet.

All of the drives I've seen have longer ground fingers (than power/signal).
We'd be using repurposed disk shelves and not "consumer kit" to host the drives.

I'm not concerned with automatically detecting insertion/removal; that's
the job that the operator performs (above) -- along with the tagging of
the media, etc.

But, I'd be concerned about pulling a drive that was still *spinning*.
atactl(8) doesn't seem to work with sd(4) devices.  I'll have to see if
drvctl -d spins the drive down as it is disconnected (and maybe a timeout
to ensure the operator doesn't remove the drive before it's had a chance
to spin down sufficiently)

