Subject: Re: Device minor numbers conversion in COMPAT_NETBSD32
To: Quentin Garnier <cube@cubidou.net>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 01/02/2006 18:55:30

On Sun, Jan 01, 2006 at 02:32:00AM +0100, Quentin Garnier wrote:

[snip]

> I.e., on i386:
>
> % ls -l /emul/netbsd32/dev/wd1[gi]
> brw-r----- 0, 14     /emul/netbsd32/dev/wd1g
> brw-r----- 0, 524296 /emul/netbsd32/dev/wd1i
>
> Whereas, on amd64:
>
> % ls -l /dev/wd1[gi]
> brw-r----- 0, 22 /dev/wd1g
> brw-r----- 0, 24 /dev/wd1i

Ugh.
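
The numbers in the two listings can be reproduced with a little arithmetic.
What follows is a minimal sketch, not the kernel's code: the function and
constant names are hypothetical, and the layouts are assumptions chosen to
match the listings (amd64 packs 16 partitions per unit; i386 keeps its
original 8 per unit and places partitions 8..15 at a high offset of 524288).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values are assumptions that match the ls output. */
#define OLD_PARTS  8u       /* partitions per unit in the original i386 layout */
#define NEW_PARTS  16u      /* partitions per unit on amd64 */
#define I386_HIGH  524288u  /* i386 offset applied to partitions 8..15 */

/* amd64-style layout: minor = unit * 16 + part. */
static uint32_t
amd64_minor(uint32_t unit, uint32_t part)
{
	assert(part < NEW_PARTS);
	return unit * NEW_PARTS + part;
}

/* i386-style layout: low 8 partitions packed as before, high 8 offset. */
static uint32_t
i386_minor(uint32_t unit, uint32_t part)
{
	assert(part < NEW_PARTS);
	return unit * OLD_PARTS + part % OLD_PARTS +
	    (part / OLD_PARTS) * I386_HIGH;
}

/* Decode an i386-style minor and re-encode it in the amd64 layout. */
static uint32_t
i386_to_amd64_minor(uint32_t minor)
{
	uint32_t unit = (minor / OLD_PARTS) % (I386_HIGH / OLD_PARTS);
	uint32_t part = minor % OLD_PARTS + (minor / I386_HIGH) * OLD_PARTS;

	return amd64_minor(unit, part);
}
```

Under these assumptions, wd1g (unit 1, partition 6) and wd1i (unit 1,
partition 8) encode to 14 and 524296 in the i386 layout and to 22 and 24 in
the amd64 layout, which is what the two listings show.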

> I noticed that trying to boot an amd64 kernel over an i386 partition
> which was on wd1g.  Booting an installed i386 partition with an amd64
> kernel is something I'd really like to achieve, but this issue actually
> makes it difficult.
>
> So my first question is, do we want to allow this?  I.e., using a /dev
> populated by an i386 MAKEDEV with an amd64 kernel.

Unfortunately we can't do that. The /dev formats aren't compatible.

> If no, it settles the issue, but I already said what my position is.
>
> If yes, the question is about where to do the conversion?  I've thought
> a bit about that and it's more complex than it seems if a file
> descriptor is passed across emulation boundaries.

The file descriptor won't matter. By that point we've already figured out
the device, so the major & minor numbers don't count.

> First, I suggest adding a field to struct emul that points to a
> conversion function.  Easy enough.  But it still leaves aside the
> question of when doing the conversion.
>
> If the conversion is done at each syscall, it will cause troubles when
> the file descriptor is passed from a native process to a netbsd32
> process, or the other way around:  the second process will try using a
> different device, which wasn't opened.
>
> Another solution is to tag the vnode with the real device number at
> the time it was opened.  The emulation (native or netbsd32) of the process
> opening the vnode will decide what the actual device is.  This will
> cause trouble when the device is opened several times concurrently.

We already effectively do this; we resolve the major & minor at open.

> At this time I tend to favour the last solution, as it introduces fewer
> tests in specfs code (and the few other places that use the devsw
> structs, like in uvm_vnode.c), which means it's less intrusive.  Also,
> very few devices will need that hack, and those that are relevant are
> not likely to be shared between native and netbsd32 processes.
>
> Of course, there's still the solution of changing the way minor numbers
> are allocated in either arch.  But it would get us a lot of angry
> users...

Unfortunately it's too late.

The problem with what you propose is that you're assuming that amd64
binaries will only see an amd64 /dev, and i386 binaries will only see an
i386 /dev. If that held, we could convert based on emulation. However, if
we only have one /dev, which is what we would have when booting an amd64
kernel over an i386 partition, we lose. The problem is that we need to know
what kind of /dev we have, not what kind of binary is opening it. :-(
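
To make the failure concrete, here is a minimal sketch, assuming the same
two minor layouts as in the listings at the top (amd64: unit * 16 + part;
i386: 8 partitions per unit, with partitions 8..15 at offset 524288). The
enum, struct and function names are hypothetical. The decoder is keyed on
the layout of the /dev tree the node came from, not on the emulation of the
process opening it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag for how a given /dev tree encodes disk minors. */
enum dev_layout { DEV_LAYOUT_AMD64, DEV_LAYOUT_I386 };

struct disk_id {
	uint32_t unit;	/* disk number, e.g. 1 for wd1 */
	uint32_t part;	/* partition index, 0 = 'a' */
};

/* Decode a disk minor according to the layout of the /dev it lives in. */
static struct disk_id
decode_minor(enum dev_layout layout, uint32_t minor)
{
	struct disk_id id;

	if (layout == DEV_LAYOUT_I386) {
		/* Assumed i386 layout: 8 low partitions, high ones offset. */
		id.unit = (minor / 8u) % 65536u;
		id.part = minor % 8u + (minor / 524288u) * 8u;
	} else {
		/* Assumed amd64 layout: 16 partitions per unit. */
		assert(layout == DEV_LAYOUT_AMD64);
		id.unit = minor / 16u;
		id.part = minor % 16u;
	}
	return id;
}
```

With a single i386-made /dev, the node wd1g carries minor 14. A native amd64
binary would select DEV_LAYOUT_AMD64 if the choice followed the emulation,
and 14 then decodes to unit 0, partition 14 (wd0o), which is the wrong
device. Selecting DEV_LAYOUT_I386 because of where the node lives gives
unit 1, partition 6, as intended. How the kernel would learn which layout a
given /dev uses is left open here.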

Take care,

Bill
