Subject: Re: What's a "real" elf loader like ?
To: Quentin Garnier <cube@cubidou.net>
From: Cherry G. Mathew <cherry.g.mathew@gmail.com>
List: tech-kern
Date: 09/23/2006 13:05:03
Hi Quentin,

Sorry about the long delay; it's been on my todo list to investigate
this further.

A few questions/comments:

On 6/17/06, Quentin Garnier <cube@cubidou.net> wrote:
> [Sorry, this is longish, but it's been on my mind for a while.]
>
> On Fri, Jun 16, 2006 at 04:01:59AM +0530, Cherry G. Mathew wrote:
> > I was wondering if someone could put down briefly and to the point,
> > what the NetBSD "elf" related shortcomings are.
> >
> > Here's my shallow understanding of the situation.
> >
> > - libsa has no standalone support for kernel modules, does not
> > understand all elf header types.
> > - There are shortcomings related to kernel module linking, which uses
> > userspace ld ( for what ? How ? )
>
> Your question is not about having an ELF loader in the kernel (and maybe
> also in libsa--but that's another debate), it is about the current
> status of kernel modules.
>
> Frankly, our LKMs suck.  The dependency on ld(1) is terrible, because
> it means LKMs depend on comp.tgz, even though modload(8) is in /sbin.
> That's the main problem we have to fix.
>
> The other problem is that lkm(4) is from another age, and has several
> shortcomings.  If you want a module that has several roles, you'll have
> to do a lot by yourself and be careful when loading and unloading.  This
> interface also has no knowledge whatsoever of dependency between
> modules;  it will happily let you unload a module that other loaded
> modules depend on.
>
> ksyms(4) also has a few issues when dealing with LKMs.  There's an open
> PR about its incredible slowness, for instance.
>
> However, the current LKM scheme is very simple:  modules are relocatable
> ELF files, which are linked to the kernel after modload(8) has retrieved
> a base address from the kernel.
>
> I've been thinking of a new LKM system for a while now, and now I'll
> describe what I would like to see, what are the inconveniences of it,
> and I'll compare it briefly with FreeBSD's KLD.
>
> Just like Eric (and completely independently), I made a patch to
> config(1) something like two years ago to allow specification of modules
> directly from inside the kernel tree.  The idea was to have as many
> modules as possible.
>
> Having good support for modules is a good thing nowadays, but in some
> environments, you really want to avoid them.  My idea at the time (it is
> still mostly the same) was to have modules as relocatable files.  Then
> you'd just link them together to build the kernel binary.  I think this
> would be quite neat, actually:  we would distribute a minimal kernel and
> all the modules, and if the user wants to have a monolithic kernel that
> only has the drivers for his hardware, he'd just have to re-do the final
> link stage with the relevant modules, and drop the rest.  No need to
> fetch the sources and recompile.  Adding a 3rd-party binary module would
> work the same way.  That's why I like relocatable images, it's a very
> flexible way to manage modules.
>
> In my config(1) patch, the granularity for the modularity was the
> attribute.  I don't remember if I had started allowing modularity for
> options and related (deffs), but it was at least planned.  I should have
> the code somewhere anyway.
>
> The kernel options description files could have stuff like this:
>
>     modular defattr foo
>
>     modular device bar
>
>     device baz
>     modular attach baz at fol with baz_fol
>
> which would mark respectively the attributes foo, bar and baz_fol as
> potentially modular.  Then the user would have e.g.:
>
>     module baz* at fol?
>
> which would have the module baz_fol.o created.
>
> If a file depends on more than one attribute (all have to be modular to
> produce a module), the module would group the attributes together (and
> all files depending on any subset of those attributes) into a module
> creatively named.  That was the weak part, but a long time goal is to
> separate more clearly sources to have less dependency of that kind.
> Modularity is not a friend of #ifdefs.
>
> So, that's what I had in mind for the past couple of years.  Now, the
> subject of the thread is about an ELF loader, right?  Ok, ok.
>
> Code to load an ELF file is not difficult to write per se.  The real
> question is to know what kind of file we want to load, because we can't
> just load any ELF file.  Relocatable files are nice because you can
> really load them about anywhere, section by section.  Static executable
> files are not suitable for modules because you can't relocate them,
> something that is not acceptable in the kernel context.  Shared objects
> are relocatable too, and make the relocator a bit simpler.  However,
> you can't group shared objects together to build a new one.
>
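
To check that I read the paragraph above correctly, here is a minimal
userland sketch of the "what kind of file do we accept" decision. It is
not taken from any tree: the sk_* names and the accept/reject policy are
my own illustration. Only the ELF header layout is given (magic in
e_ident[0..3], data encoding in e_ident[5], e_type at byte offset 16,
ET_REL = 1, ET_EXEC = 2, ET_DYN = 3).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Object file types from the ELF specification (e_type values). */
enum { SK_ET_REL = 1, SK_ET_EXEC = 2, SK_ET_DYN = 3 };

/* Hypothetical loader policy, for illustration only. */
enum sk_verdict { SK_REJECT = 0, SK_LOAD_RELOCATABLE, SK_LOAD_SHARED };

/*
 * Look at the first bytes of a file and decide whether a module loader
 * could place it at an arbitrary address.  Static executables (ET_EXEC)
 * are rejected because they cannot be relocated.
 */
static enum sk_verdict
sk_classify(const unsigned char *hdr, size_t len)
{
	uint16_t type;

	/* e_type occupies bytes 16 and 17, right after e_ident[16]. */
	if (len < 18 || memcmp(hdr, "\177ELF", 4) != 0)
		return SK_REJECT;

	/* e_ident[5] (EI_DATA) gives the byte order of the header. */
	if (hdr[5] == 1)		/* ELFDATA2LSB */
		type = (uint16_t)(hdr[16] | (hdr[17] << 8));
	else if (hdr[5] == 2)		/* ELFDATA2MSB */
		type = (uint16_t)((hdr[16] << 8) | hdr[17]);
	else
		return SK_REJECT;

	switch (type) {
	case SK_ET_REL:
		return SK_LOAD_RELOCATABLE;
	case SK_ET_DYN:
		return SK_LOAD_SHARED;
	default:
		return SK_REJECT;
	}
}
```

A real loader would of course also check the class, machine and version
fields before going any further.
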
> Right now we have two needs:  an ELF loader, and a relocator.  Each of
> them kind of depends on what the other intends to deal with.
>
> The past few days I started working on writing an in-kernel ELF loader.
> I have several train journeys planned for the next two weeks so I will
> have some time to work on this :-) (It's just the loader;  someone else
> started working on a relocator.)
>
> We actually have a third need, the module manager itself.  Something has
> to handle dependencies, initialise the modules and so on.  Information
> about all this has to be stored somewhere.
>
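
For the dependency part, the simplest thing I can picture is a
reference count per module, so that unloading something other modules
depend on is refused (which lkm(4) does not do today, as you note
above). The sketch below is only that idea in plain C. The sk_mod
structure, the fixed-size dependency array and the function names are
all hypothetical.

```c
#include <stddef.h>

#define SK_MAXDEPS 4

/* Hypothetical per-module record kept by the module manager. */
struct sk_mod {
	const char *name;
	int loaded;			/* non-zero once initialised */
	unsigned int refs;		/* loaded modules depending on us */
	struct sk_mod *deps[SK_MAXDEPS];/* modules we depend on */
	size_t ndeps;
};

/* Load a module; fails (-1) if any dependency is not loaded yet. */
static int
sk_mod_load(struct sk_mod *m)
{
	size_t i;

	if (m->loaded)
		return 0;
	for (i = 0; i < m->ndeps; i++)
		if (!m->deps[i]->loaded)
			return -1;
	/* Pin every dependency for as long as we stay loaded. */
	for (i = 0; i < m->ndeps; i++)
		m->deps[i]->refs++;
	m->loaded = 1;
	return 0;
}

/* Unload a module; fails (-1) while other loaded modules need it. */
static int
sk_mod_unload(struct sk_mod *m)
{
	size_t i;

	if (!m->loaded || m->refs != 0)
		return -1;
	for (i = 0; i < m->ndeps; i++)
		m->deps[i]->refs--;
	m->loaded = 0;
	return 0;
}
```

With a "base" module and a "user" module that lists base as a
dependency, sk_mod_unload(&base) fails until user has been unloaded.
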
> Link sets are attractive to store that information, but as it
> potentially burns a lot of sections, I'd rather have them disposable as
> soon as the module is done using them.  For that I thought of having a
> program header for the module code and data, the resident part, and one
> for the disposable sections, which are used only during initialisation.
>

AFAIU, KLD reads only as much of the module as it needs. I'm not sure
what you meant by having the sections "disposable". Wouldn't discarding
them be a simple malloc()/free()? On top of that, MD sections (the ia64
unwind section, for example) can be loaded by the MD parts of the
loader.

> Of course, ld -r is unable to produce files with program headers (and,
> why not?  an entry point), so an external tool would be needed, and then
> linking together modules would be slightly harder.
>
> Then there is the issue of symbol visibility.  One thing that will have
> to be defined at some point is what is the kernel API for modules.
> Currently, everything is potentially part of the API, so it is very hard
> to keep track of the changes.
>
> FreeBSD went the dynamic way with its KLD system.  The kernel itself is
> a dynamic executable, and modules are shared objects.  However, the only
> use made of this, I think, is the more simple relocator.  For instance,
> module dependencies are not expressed in terms of NEEDED entries in the
> .dynamic section.  (Not that they necessarily should;  it's just
> something the mere nature of a dynamic object offers.)
>
> One thing to note is that with relocatable files, there are a few archs
> that are troublesome, namely arm and ppc.  But automatically generating
> trampoline for those is doable, and not that difficult.  However, it's a
> problem we'd not have with shared objects.
>
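
I am assuming the trouble on arm and ppc is the limited reach of their
direct branch instructions: both encode a 24-bit word offset, i.e. a
signed 26-bit byte displacement, or about +/- 32MB. If that is what you
mean, the check the relocator has to make before deciding to emit a
trampoline would look something like this. The function name is
hypothetical, and it ignores per-architecture details such as the pc
bias on arm.

```c
#include <stdint.h>

/* Reach of a 24-bit word-offset branch: signed 26-bit byte range. */
#define SK_BRANCH_REACH	((int64_t)1 << 25)

/*
 * Return 1 if a direct branch located at `from' cannot reach `to',
 * in which case the relocator has to route the call through a
 * trampoline placed within range; return 0 otherwise.
 */
static int
sk_needs_trampoline(uint64_t from, uint64_t to)
{
	int64_t disp = (int64_t)(to - from);

	return disp < -SK_BRANCH_REACH || disp >= SK_BRANCH_REACH;
}
```
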
> My current plan is to continue experimenting manipulation of relocatable
> files for a while to see if I can end up with something satisfactory on
> accounts of memory requirements, ease of use and modules management.  I
> will always have the potential solution of special-casing sections
> depending on their name, even though it introduces too much knowledge in
> the ELF loader itself.
>
> Oh, and one last thing.  The syscall to load a module will take two
> const char * arguments:  the name of the file to load, and a proplib
> dictionary, to be used to pass parameters.
>
> I hope this answers your questions about the state of this, and with
> input from other people we'll end up with a clear objective very soon.


Just one other question:

How much code do we have within the NetBSD source tree that can be
re-used or turned into a library? For example, the shared library
relocator (/libexec/ld.so), kern/exec_elf*.c, etc. How realistic is it
to try to re-use code from both?

-- 
~Cherry