tech-kern archive


re: Time to merge the pgoyette-compat branch



> >> * Introduction of module "aliases".  In addition to its own name, a
> >>    module can now provide alias names.  This is useful for the
> >>    monolithic compat module, which now contains the functionality of
> >>    the many version-specific modules.  If you load the monolithic module,
> >>    its aliases will prevent you from also loading individual-version
> >>    compat modules.
> >
> > i'm not sure i understand the point of this.  how does the normal
> > duplicate symbol detection not cause a failure?  why do we care
> > about names, vs what is actually imported or exported?  ie, what
> > happens differently if this change is excluded?
> 
> This issue arises in a situation where you have none of the compat
> code built in.  First you load (for example) the compat_70 module,
> and then you try to load the monolithic compat module.  You will
> of course get duplicate symbols, and the load will fail with ENOEXEC.
> But if you try to otherwise load a duplicate module, the load will
> fail with EEXIST.  The alias mechanism allows the monolithic compat
> module to have multiple names, including compat_70, so the load can
> fail with the (IMHO) more informative EEXIST - module already exists.

hm.  i don't know.  i don't think this is useful, as it is
additional checking that isn't needed, and i don't really
agree with "more informative".  if a module can't load
because a symbol already exists, i'd expect something more
specific, eg EBUSY, rather than either ENOEXEC or EEXIST.

i'd really rather this part not be merged.
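
for reference, the two failure paths being discussed can be
sketched in a small user-space model.  this is only an
illustration, not the kernel's module code: the struct, the
toy_* functions and the symbol names are all invented here,
and it assumes that the name (and alias) check runs before
symbol resolution.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_LOADED 8

/* Hypothetical, simplified descriptor; this is not NetBSD's modinfo_t. */
struct toy_module {
	const char *name;
	const char *const *aliases;	/* NULL-terminated list, or NULL */
	const char *const *symbols;	/* NULL-terminated exported symbols */
};

static const struct toy_module *toy_loaded[TOY_MAX_LOADED];
static size_t toy_nloaded;

/* Return 1 if string s appears in the NULL-terminated list. */
static int
toy_in_list(const char *const *list, const char *s)
{
	if (list == NULL)
		return 0;
	for (; *list != NULL; list++)
		if (strcmp(*list, s) == 0)
			return 1;
	return 0;
}

/* Return 1 if module m answers to name n, directly or via an alias. */
static int
toy_answers_to(const struct toy_module *m, const char *n)
{
	return strcmp(m->name, n) == 0 || toy_in_list(m->aliases, n);
}

/* Return 1 if any name or alias of a is also a name or alias of b. */
static int
toy_names_clash(const struct toy_module *a, const struct toy_module *b)
{
	const char *const *al;

	if (toy_answers_to(b, a->name))
		return 1;
	for (al = a->aliases; al != NULL && *al != NULL; al++)
		if (toy_answers_to(b, *al))
			return 1;
	return 0;
}

/* Return 1 if a and b export at least one symbol in common. */
static int
toy_symbols_clash(const struct toy_module *a, const struct toy_module *b)
{
	const char *const *s;

	for (s = a->symbols; s != NULL && *s != NULL; s++)
		if (toy_in_list(b->symbols, *s))
			return 1;
	return 0;
}

/* Forget everything that was loaded. */
void
toy_reset(void)
{
	toy_nloaded = 0;
}

/* Return 0 on success, or an errno value describing the failure. */
int
toy_module_load(const struct toy_module *m)
{
	size_t i;

	assert(m != NULL && m->name != NULL);
	for (i = 0; i < toy_nloaded; i++)	/* name/alias check first */
		if (toy_names_clash(m, toy_loaded[i]))
			return EEXIST;
	for (i = 0; i < toy_nloaded; i++)	/* then duplicate symbols */
		if (toy_symbols_clash(m, toy_loaded[i]))
			return ENOEXEC;
	if (toy_nloaded == TOY_MAX_LOADED)
		return ENOMEM;
	toy_loaded[toy_nloaded++] = m;
	return 0;
}

/* Sample modules; the symbol names are invented for illustration. */
static const char *const syms_70[] = { "compat_70_hook", NULL };
static const char *const syms_all[] =
    { "compat_60_hook", "compat_70_hook", NULL };
static const char *const aliases_all[] = { "compat_60", "compat_70", NULL };

const struct toy_module toy_compat_70 = { "compat_70", NULL, syms_70 };
const struct toy_module toy_compat_noalias = { "compat", NULL, syms_all };
const struct toy_module toy_compat_aliased =
    { "compat", aliases_all, syms_all };
```

in this model, loading toy_compat_70 and then
toy_compat_noalias fails on the duplicate symbol, while
loading toy_compat_70 and then toy_compat_aliased is
rejected earlier, by name.  either way the second load
fails; only the errno differs.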

> >> * Removed linking of the .o kernel compat library into all kernels.
> >>    This caused problems, since the library included lots of compat
> >>    symbols, but did not include module linkage; attempts to subsequently
> >>    load some modules would fail due to multiply-defined symbols.
> >
> > can you expand upon this more?  what exactly has changed how?
> 
> When maxv did (one of) his rototills a while ago, he replaced the .a
> compat library with a .o library.  As I recall, the reason was that
> there were some shared routines that weren't strictly compat (I don't
> have the original discussion handy).  By using a .o library, we end up
> with much of the compat code included in the kernel even without the
> COMPAT_xx options.  (And the routines that caused maxv to do this have
> now been included in a module of their own, and are required by their
> callers.)
> 
> On the branch, we return to a .a library, only including in the
> kernel those things that are actually needed to resolve symbols in
> the kernel.

i think you've misunderstood why we have .o vs .a for kernel
libraries.  in fact, the whole reason we have a .o version of
them is _only for modular kernels_.  if there is a problem
with a .o version for modular kernels, then that library is
being built or used wrongly, or perhaps the code should even
live elsewhere.

eg, the libcompat.{a,o} difference is entirely about what
ends up in the linked kernel image.  modular kernels are
expected to be able to load modules, so any support code
for those modules should be present, regardless of known
use.  this is why we normally link the library as a .o in
this case.  non-modular kernels are 100% statically linked,
so we can build the kernel libraries as a .a and let the
linker include only the parts we use.
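
the difference can be sketched with a toy model of the two
linking modes.  this is a rough illustration only, not how
ld or the kernel build is implemented: the member layout
(one defined symbol and at most one reference per object),
the toy_* functions and all symbol names are assumptions
made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical library member: one object file that defines one
 * symbol and references at most one other.
 */
struct toy_member {
	const char *defines;
	const char *needs;	/* NULL if it references nothing */
	int included;
};

/* Linking the library as one .o: every member ends up in the image. */
size_t
toy_link_object(struct toy_member *lib, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		lib[i].included = 1;
	return n;
}

/*
 * Linking the library as a .a: only members that resolve an undefined
 * symbol are pulled in, repeated until nothing new is needed.
 */
size_t
toy_link_archive(struct toy_member *lib, size_t n,
    const char *const *undef, size_t nundef)
{
	size_t i, j, count = 0;
	int changed;

	assert(lib != NULL);
	/* First pull members that define a symbol the kernel references. */
	for (i = 0; i < n; i++) {
		lib[i].included = 0;
		for (j = 0; j < nundef; j++)
			if (strcmp(lib[i].defines, undef[j]) == 0)
				lib[i].included = 1;
		count += lib[i].included;
	}
	/* Then pull whatever the included members themselves reference. */
	do {
		changed = 0;
		for (i = 0; i < n; i++) {
			if (!lib[i].included || lib[i].needs == NULL)
				continue;
			for (j = 0; j < n; j++) {
				if (!lib[j].included &&
				    strcmp(lib[j].defines, lib[i].needs) == 0) {
					lib[j].included = 1;
					count++;
					changed = 1;
				}
			}
		}
	} while (changed);
	return count;
}

/* Sample library and kernel reference; all names are invented. */
struct toy_member toy_libcompat[] = {
	{ "compat_50_foo", "compat_util", 0 },
	{ "compat_util", NULL, 0 },
	{ "compat_60_bar", NULL, 0 },
	{ "compat_70_baz", NULL, 0 },
};
const char *const toy_kernel_undef[] = { "compat_50_foo" };
```

with the sample data, toy_link_archive() pulls in only
compat_50_foo and the compat_util it references, while
toy_link_object() includes all four members, including the
ones a later module load might need.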

so, if there is a problem with modular kernels and the .o,
then it sounds like the problem is with the usage, and (IMO)
that is not a reason to break the linkage.  your method
sounds like it _should_ work without this change, but i can
imagine that unclean usages will trigger the failures you've
seen.


.mrg.

