Subject: Re: Unicode support in iso9660.
To: Pavel Cahyna <pavel.cahyna@st.cuni.cz>
From: Allen Briggs <briggs@netbsd.org>
List: tech-kern
Date: 11/22/2004 11:07:30
On Mon, Nov 22, 2004 at 02:41:03PM +0100, Pavel Cahyna wrote:
> The GTK applications do need to display the filenames using X11. How to do
> that without assuming that filenames are encoded character sequences?

The user needs to be able to read and write meaningful filenames.
So the encoding / decoding has to happen somewhere.  The question
is, where?

Currently, the filesystem is agnostic.  As long as path components
are separated by '/' and the path ends with a NUL character, the
kernel doesn't really care what the encoding is.  I think der
Mouse's point is that this is the way it should be--why should the
kernel care what the encoding is when it's essentially the province
of userland to make sense of the data?
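
To make the "agnostic" point concrete, here is a rough sketch in
Python (not kernel code).  The helper is_valid_component() is a
hypothetical name, and the rule it checks (non-empty, no '/', no NUL)
is only the constraint described above, not a description of any
particular kernel's lookup code.

```python
def is_valid_component(name: bytes) -> bool:
    """Hypothetical check: the only rule applied to a path component
    is that it is non-empty and contains neither '/' nor NUL."""
    return len(name) > 0 and b"/" not in name and b"\x00" not in name


# The same human-readable name in three byte-oriented encodings.
# Each produces different bytes, and each passes the check, because
# the check never interprets the bytes as characters.
name = "žluťoučký.txt"
for codeset in ("utf-8", "iso8859-2", "cp1250"):
    raw = name.encode(codeset)
    print(codeset, raw, is_valid_component(raw))

# A 16-bit encoding embeds NUL bytes for ASCII characters, so it
# cannot be passed through this kind of interface unchanged.
print(is_valid_component("a.txt".encode("utf-16-be")))
```

The last line shows why a 16-bit on-media encoding has to be
converted by somebody before it can be used as an ordinary path.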

In any event, a given piece of media will have filenames encoded
in some fashion, be it ASCII, UTF-8, or "other".  I don't see how
having the kernel know anything about the actual encoding would be
particularly practical.

The question is, how do you determine the encoding?  And where do
you want that knowledge to be?  My inclination is that it would be
much more flexible and expandable to have it in userland.  It would
be more uniform to have it in the kernel, but it's not clear to me
that the problem is well-enough defined yet for the kernel to do
the right thing.

Starting from scratch, I'd think that it would be best to use some
sort of i18n library to encode/decode paths to a known format (UTF-8
or whatever) from the filesystem's encoding.  This would have to
be initialized with the codeset information from the locale and/or
from the media.
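
A minimal sketch of what such a userland conversion layer might look
like, in Python for brevity.  The function names and the
media_codeset parameter are made up for illustration; how the
codeset would really be discovered from the media is exactly the
open question, so here it is simply passed in, with the locale's
codeset as a fallback.

```python
import locale
from typing import Optional


def display_name(raw: bytes, media_codeset: Optional[str] = None) -> str:
    """Decode on-media filename bytes into text for display.

    media_codeset is a hypothetical hint taken from the media; when
    it is missing, fall back to the codeset of the user's locale.
    """
    codeset = media_codeset or locale.getpreferredencoding(False)
    # Bytes that are invalid in the codeset become U+FFFD rather
    # than raising, so a bad guess still yields something printable.
    return raw.decode(codeset, errors="replace")


def to_utf8(raw: bytes, media_codeset: Optional[str] = None) -> bytes:
    """Re-encode on-media filename bytes into the known format (UTF-8)."""
    return display_name(raw, media_codeset).encode("utf-8")


# 0xBE is LATIN SMALL LETTER Z WITH CARON in ISO 8859-2.
print(to_utf8(b"\xbeluva.txt", "iso8859-2"))
```

Replacing undecodable bytes loses information, so a real library
would probably need a reversible escape for names it cannot decode.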

-allen

-- 
                  Use NetBSD!  http://www.netbsd.org/