Subject: Re: bin/12017: how to enable multibyte locale (and problem around it)
To: Noriyuki Soda <soda@sra.co.jp>
From: Greg A. Woods <woods@weird.com>
List: netbsd-bugs
Date: 01/24/2001 04:28:50
[ On Wednesday, January 24, 2001 at 13:23:24 ( +0900), Noriyuki Soda wrote: ]
> Subject: Re: bin/12017: how to enable multibyte locale (and problem around it)
>
> > 	if there's dlopen() call, we cannot predict what kind of code will be
> > 	loaded into, hence then cannot test enough.
> > 
> > 	well, the decision is up to people using the system, not us.
> 
> That doesn't make sense for me.
> User can predict the locale module by setting LC_* environment
> variables.
> 
> If a user cannot control their environment variables,
> then something is seriously broken in the environment.

Well yes the user can certainly control their environment variables.
But that is the problem -- the administrator, and more importantly the
code itself, cannot trust that the user is not trying to subvert the
system by supplying unauthorised values for their environment variables.
Certainly a set-ID process could clobber its environment and thus avoid
risking execution of unauthorised code, but in that case it seems
pointless to even offer such programs the capability of run-time
extensibility for something like i18n since it would never be possible
to use it.
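
A minimal sketch of what "clobbering the environment" could look like
for the locale variables alone.  scrub_locale_env() is a hypothetical
helper, not an existing libc interface, and the list of names is the
usual POSIX set rather than anything taken from this PR:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for unsetenv() */
#endif
#include <stdlib.h>

/*
 * Hypothetical helper: remove every variable that selects a locale so
 * that nothing supplied by the user can steer later locale handling.
 * Returns the number of variables that were set and have been removed.
 */
int
scrub_locale_env(void)
{
	static const char *const names[] = {
		"LANG", "LC_ALL", "LC_COLLATE", "LC_CTYPE",
		"LC_MESSAGES", "LC_MONETARY", "LC_NUMERIC", "LC_TIME",
	};
	size_t i;
	int removed = 0;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		/* count only variables that were actually present */
		if (getenv(names[i]) != NULL && unsetenv(names[i]) == 0)
			removed++;
	}
	return removed;
}
```

A set-ID program would call this at the top of main(), before any
setlocale(LC_ALL, "") call.  After that the program typically ends up
in the default locale no matter what the user asked for, which is
exactly why run-time extensibility is of no use to it.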

In other words, yes, it is necessary to have a way to ensure that
explicitly statically linked binaries cannot ever dynamically load code
in any way whatsoever, neither by their own explicit choice nor under
the control of the user's environment, not even if all that control
provides is a boolean value.  This must be something that can be
enforced at link time at the very latest.
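
One way to picture "fixed at link time" is a table of locale methods
that is compiled into the binary, so that a name can only ever select
among code that is already present.  This is only a sketch under
stated assumptions: struct locale_method, find_locale_method() and the
"TOY2" encoding are all made up for illustration and are not part of
any existing libc.

```c
#include <stddef.h>
#include <string.h>

/* Returns the byte length of the first character in s, 0 if incomplete. */
typedef size_t (*char_len_fn)(const char *s, size_t n);

struct locale_method {
	const char *name;
	char_len_fn char_len;
};

/* "C" locale: every character is exactly one byte. */
static size_t
c_char_len(const char *s, size_t n)
{
	(void)s;
	return n > 0 ? 1 : 0;
}

/* Toy encoding: a byte with the high bit set starts a two-byte character. */
static size_t
toy2_char_len(const char *s, size_t n)
{
	if (n == 0)
		return 0;
	if (((unsigned char)s[0] & 0x80) == 0)
		return 1;
	return n >= 2 ? 2 : 0;
}

/* The complete set of methods; nothing can be added after linking. */
static const struct locale_method linked_in[] = {
	{ "C", c_char_len },
	{ "TOY2", toy2_char_len },
};

/* Unknown or missing names fall back to "C"; no file is ever opened. */
const struct locale_method *
find_locale_method(const char *name)
{
	size_t i;

	if (name != NULL) {
		for (i = 0; i < sizeof(linked_in) / sizeof(linked_in[0]); i++) {
			if (strcmp(linked_in[i].name, name) == 0)
				return &linked_in[i];
		}
	}
	return &linked_in[0];
}
```

The name is only ever compared against entries in the table and is
never used as a path, so a hostile value selects the "C" methods at
worst.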

Dynamic loading of code presents the certain risk that unauthorised code
may be loaded and executed and the mere possibility of this happening
must be prevented in some environments, at least for sensitive binaries
that could in some way provide enhanced privileges or capabilities to
unauthorised users (eg. even in binaries that may be executed by
privileged users, but be influenced by unprivileged users in some way).
In theory this possibility is no different, and is no less dangerous,
than a buffer overflow.  In Unix the kernel trusts that binaries it is
loading and executing have been appropriately protected by the ACLs on
the filesystem (especially in the case where they are being executed by
the privileged user, or in the case where they are set-ID), which is
already a tenuous relationship at best.  Any mechanism that gives the
user even a tiny amount more control over alternate or subordinate code
segments which are also loaded and executed (but now possibly even by
the process, not the kernel) increases the risk that unauthorised code
will be executed.

(Which is why kernel mode shared libraries ala ATT SysVr2 are, at least
in theory, safer.  They provide for more diverse sharing of code across
binaries and also smaller on-disk images and faster load times, while at
the same time using the kernel to pre-load hopefully secure shared
images in the same way the main process images are loaded and user-level
code has no control over the loading at run-time -- it just runs *after*
it has been loaded.)

> We don't have to provide dlopen()-less library for dynamically linked
> binaries for same reason.

I have no disagreement with that -- if dynamically linked binaries are
being used then it should be just as OK for them to load new locale
handling code from a pre-configured path (or paths) as it is for them to
load libc itself from the same pre-configured path(s).  However....


I'm not nearly well enough versed in all the requirements for full i18n,
nor in the existing best practices, but it would seem to me from the
current discussion that some interfaces might best be kept out of libc
and be introduced in some separate library.  This library would
presumably be dependent only on libc (and not the other way around, of
course), and as such it could, if I understand correctly, be updated
independently.  Thus for dynamic binaries simply updating it could
provide new methods for the likes of getdate() while still not requiring
that the library be able to use dlopen() to also load new dynamic code
on its own (and obviously not requiring re-compilation or relinking of
the dynamically linked binaries using it); and static binaries would be
"frozen" at compile and/or link time with existing methods, or
alternately even compiled/linked with only the basic equivalent of
LC_ALL=C methods if space is perceived at compile time to be an issue in
the target environment.  Whether the latter is best done with a
"-lnolocale" stub library or even some preprocessor flag (eg. even the
logical equivalent of "-Dsetlocale=") is not something I can offer a
concrete opinion on at the moment.
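
For illustration only, the kind of function a "-lnolocale" stub library
might supply could look roughly like the following.  It is named
stub_setlocale() here, a made-up name, so that the sketch does not
clash with the real setlocale(); a real stub library would presumably
define setlocale() itself.  The behaviour shown is an assumption about
what "the basic equivalent of LC_ALL=C" would mean.

```c
#include <locale.h>	/* LC_* category constants for callers */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stub: the process is permanently in the "C" locale.
 * Queries and requests for "", "C" or "POSIX" succeed; any other
 * locale name fails, as setlocale() does for an unknown locale.
 * The environment is never consulted.
 */
char *
stub_setlocale(int category, const char *name)
{
	static char c_locale[] = "C";

	(void)category;		/* every category is fixed to "C" */

	if (name == NULL)	/* query of the current locale */
		return c_locale;
	if (name[0] == '\0' || strcmp(name, "C") == 0 ||
	    strcmp(name, "POSIX") == 0)
		return c_locale;
	return NULL;		/* no other locale is linked in */
}
```

A caller that does stub_setlocale(LC_ALL, "") gets "C" back whatever
LC_ALL or LANG happen to be, and a request for any other locale fails
cleanly instead of triggering a search for loadable code.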

In other words I still see no reason why dlopen() needs to be used to
provide run-time extensibility of i18n methods, at least not for
dynamically linked binaries.  Are the dependencies between these methods
and libc circular in some serious unstoppable way if they were to be
removed from libc?

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>