Subject: Re: misc kern questions
To: Travis H. <solinym@gmail.com>
From: Quentin Garnier <cube@cubidou.net>
List: tech-kern
Date: 08/25/2006 12:48:02

On Thu, Aug 24, 2006 at 11:30:51PM -0500, Travis H. wrote:
> On 8/24/06, Daniel Carosone <dan@geek.com.au> wrote:
> >We have a syscall versioning mechanism for exactly this purpose.
>
> Actually, I think I've dealt with this... I recall some funky sort of
> symbol renaming such that _execve becomes _execve13 or something like
> that.
>
> Can anyone describe succinctly how, say, Linux binary emulation works?

Binaries are identified at execve time, and the struct proc associated
with the newly created process contains a pointer to the struct emul
of the identified emulation.  NetBSD supports a lot of such binary
emulations, not just Linux, and guessing the emulation type is not
always easy.

See for example sys/compat/linux/common/linux_exec_elf32.c.
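
To make the idea concrete, here is a rough user-space sketch, not the
kernel code.  Everything prefixed toy_ is made up for illustration, and
the "image format" is an assumption: the first byte is simply treated as
a tag, which is far easier than what the real probes have to do.

```c
/* Toy model of choosing an emulation at exec time.  The toy_ names and
 * the one-byte "format tag" are assumptions made for this sketch. */
#include <assert.h>
#include <stddef.h>

struct toy_emul {
	const char *e_name;
	/* Returns nonzero if this emulation recognises the image. */
	int (*e_probe)(const unsigned char *img, size_t len);
};

struct toy_proc {
	const struct toy_emul *p_emul;	/* set when the image is identified */
};

static int
toy_probe_native(const unsigned char *img, size_t len)
{
	return len > 0 && img[0] == 'N';
}

static int
toy_probe_linux(const unsigned char *img, size_t len)
{
	return len > 0 && img[0] == 'L';
}

/* Probes are tried in order; the first one that matches wins. */
static const struct toy_emul toy_emuls[] = {
	{ "netbsd", toy_probe_native },
	{ "linux", toy_probe_linux },
};

/* Returns 0 on success, -1 if no emulation claims the image. */
int
toy_execve(struct toy_proc *p, const unsigned char *img, size_t len)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < sizeof(toy_emuls) / sizeof(toy_emuls[0]); i++) {
		if (toy_emuls[i].e_probe(img, len)) {
			p->p_emul = &toy_emuls[i];
			return 0;
		}
	}
	return -1;
}
```

After a successful toy_execve(), everything else only needs to look at
p->p_emul to know which emulation it is dealing with.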

> URLs welcome.  Calls to RTFM just as welcome.  I'm hoping for
> something very low-level, like explaining that int 0x80 maps does
> this-or-that, whereas NetBSD natively uses int 0x81, or whatever.

You know, the world is not an i386.  The way a process enters the kernel
matters little, because what identifies it is its struct proc, which the
kernel knows at all times through the curlwp variable:  the kernel is in
charge of switching processes, so it knows what's running.

When int 0x80 is issued by userland, the kernel uses the syscall handler
defined by the binary emulation.  If it's the native emulation, we use
syscall_plain (or _fancy if the process is traced);  in the Linux case,
it's redirected to linux_syscall_plain.
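
As a rough illustration of that dispatch, here is a user-space sketch.
All the toy_ names are invented for the example, and the "fancy" Linux
handler is assumed by analogy with the native one.

```c
/* Toy model of per-emulation syscall dispatch.  Every name here is a
 * stand-in invented for this sketch, not a real kernel symbol. */
#include <assert.h>
#include <stddef.h>

enum toy_handler {
	TOY_NATIVE_PLAIN,
	TOY_NATIVE_FANCY,
	TOY_LINUX_PLAIN,
	TOY_LINUX_FANCY
};

typedef enum toy_handler (*toy_syscall_t)(int code);

struct toy_emul {
	toy_syscall_t e_plain;	/* normal path */
	toy_syscall_t e_fancy;	/* path used when the process is traced */
};

struct toy_proc {
	const struct toy_emul *p_emul;
	int p_traced;
};

/* Each handler just reports which one ran. */
static enum toy_handler native_plain(int code) { (void)code; return TOY_NATIVE_PLAIN; }
static enum toy_handler native_fancy(int code) { (void)code; return TOY_NATIVE_FANCY; }
static enum toy_handler linux_plain(int code) { (void)code; return TOY_LINUX_PLAIN; }
static enum toy_handler linux_fancy(int code) { (void)code; return TOY_LINUX_FANCY; }

const struct toy_emul toy_emul_netbsd = { native_plain, native_fancy };
const struct toy_emul toy_emul_linux = { linux_plain, linux_fancy };

/* Same trap for everybody; the current process decides who handles it. */
enum toy_handler
toy_int80(const struct toy_proc *curp, int code)
{
	toy_syscall_t handler;

	assert(curp != NULL && curp->p_emul != NULL);
	handler = curp->p_traced ? curp->p_emul->e_fancy : curp->p_emul->e_plain;
	return handler(code);
}
```

The point is that toy_int80() never inspects how the trap was raised; it
only follows the pointer stored in the process.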

> Do the Linux syscalls actually implement the entire syscall, or do
> they rearrange arguments and call the native syscall?  Does NetBSD use

It really depends.  If we have a native syscall that is close enough,
then we use it, otherwise there's specific code to handle it.
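
A sketch of the "close enough" case, in user space and with made-up
toy_ names: the Linux-side kill() only has to translate the signal
number before handing over to the native implementation.  Only SIGUSR1
is translated here (10 on Linux/i386, 30 on NetBSD); pretending all the
other numbers match is a simplification of the sketch.

```c
/* Toy model of a Linux syscall implemented as a thin wrapper around the
 * native one.  The toy_ names are invented for this sketch. */
#include <assert.h>

#define TOY_LINUX_SIGUSR1	10
#define TOY_NATIVE_SIGUSR1	30

/* Records what the "native" syscall was asked to do. */
int toy_last_pid;
int toy_last_signo;

/* Stand-in for the native kill(2) implementation. */
int
toy_sys_kill(int pid, int signo)
{
	assert(signo >= 0);
	toy_last_pid = pid;
	toy_last_signo = signo;
	return 0;
}

/* Map a Linux signal number to the native one. */
static int
toy_linux_to_native_signo(int lsig)
{
	switch (lsig) {
	case TOY_LINUX_SIGUSR1:
		return TOY_NATIVE_SIGUSR1;
	default:
		return lsig;	/* simplification: assume the rest match */
	}
}

/* Linux entry point: rearrange the arguments, then go native. */
int
toy_linux_sys_kill(int pid, int lsig)
{
	return toy_sys_kill(pid, toy_linux_to_native_signo(lsig));
}
```

When nothing native is close enough, the Linux-side function has to do
the whole job itself instead of delegating like this.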

> any Linux code?  IIRC Linux keeps most of the args in registers, and
> BSD copies them off the stack; how do we support both, just
> __asmlinkage__ the Linux syscalls and not the native?

In the end everything is copied as an array, so for the syscall handlers
it's pretty much the same in both native syscalls and Linux handlers.
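
Roughly, and again as a user-space sketch with invented toy_ names: the
two entry paths differ only in where they collect the arguments from,
and the handler sees a plain array either way.  The register names
follow the Linux/i386 convention; the memcpy() stands in for the copy
from user space.

```c
/* Toy model of argument marshalling.  All toy_ names are assumptions
 * made for this sketch. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAXARGS 5

struct toy_frame {
	long ebx, ecx, edx, esi, edi;	/* register-passed arguments */
	const long *user_sp;		/* stands in for the user stack */
};

typedef long (*toy_sy_call_t)(const long *args);

/* A handler only ever sees the array. */
long
toy_sys_add3(const long *args)
{
	return args[0] + args[1] + args[2];
}

/* Native-style entry: arguments are copied off the user stack. */
long
toy_native_syscall(const struct toy_frame *f, toy_sy_call_t fn, size_t narg)
{
	long args[TOY_MAXARGS] = { 0 };

	assert(narg <= TOY_MAXARGS);
	memcpy(args, f->user_sp, narg * sizeof(args[0]));
	return fn(args);
}

/* Linux-style entry: arguments are gathered from the saved registers. */
long
toy_linux_syscall(const struct toy_frame *f, toy_sy_call_t fn, size_t narg)
{
	long args[TOY_MAXARGS] = { f->ebx, f->ecx, f->edx, f->esi, f->edi };

	assert(narg <= TOY_MAXARGS);
	return fn(args);
}
```

Both entry points can call the same toy_sys_add3(), which is why a Linux
handler and a native one look alike from the inside.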

> I assume that emulating Linux LKMs is not possible; does VMWare run

That's not technically impossible.  Certainly tedious, but not
impossible.

> natively?  How about Xen?

VMWare runs under Linux emulation, but the kernel modules it uses have
NetBSD ports.  Well, they used to; I don't think they're maintained
anymore.

Xen is nothing specific to Linux.

-- 
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"When I find the controls, I'll go where I like, I'll know where I want
to be, but maybe for now I'll stay right here on a silent sea."
KT Tunstall, Silent Sea, Eye to the Telescope, 2004.
