Subject: Re: Unicode support in iso9660.
To: Martin Husemann <email@example.com>
From: Jaromir Dolecek <jdolecek@NetBSD.org>
Date: 11/19/2004 17:37:02
Martin Husemann wrote:
> What I understand is
> - the current cd9660 code is plain stupid wrong for the non-joliet
> case (should just pass out the stream of bytes, just like ffs)
Actually, AFAICS in the code, cd9660 doesn't care about non-Joliet
names.
> - the current cd9660 code is still wrong for Joliet - it should
> transcode to utf-8 (I'm not really sure you agree on this)
> - the latter applies to msdosfs long names
Right. UTF-8 is TRT as the generic presentation for file systems
based on Unicode, such as NTFS, cd9660 with Joliet, and msdosfs
long names.
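To illustrate what presenting such names in UTF-8 involves, here is a
minimal sketch in C. It assumes the name is stored as big-endian UCS-2,
as Joliet does. The function name ucs2be_to_utf8 is hypothetical and is
not taken from the cd9660 code, and surrogate pairs are deliberately
not handled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the cd9660 sources): convert a
 * big-endian UCS-2 name to NUL-terminated UTF-8.
 *
 * Returns the number of bytes written, not counting the NUL, or
 * (size_t)-1 if the input length is odd or the output buffer is
 * too small. Surrogate code units are not combined; a real
 * implementation would have to decide how to treat them.
 */
size_t
ucs2be_to_utf8(const uint8_t *in, size_t inlen, char *out, size_t outlen)
{
	size_t i, o = 0;

	if (inlen % 2 != 0 || outlen == 0)
		return (size_t)-1;

	for (i = 0; i < inlen; i += 2) {
		uint16_t c = (uint16_t)((in[i] << 8) | in[i + 1]);
		size_t need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;

		/* Always keep one byte free for the terminating NUL. */
		if (outlen - o <= need)
			return (size_t)-1;

		if (c < 0x80) {
			out[o++] = (char)c;
		} else if (c < 0x800) {
			out[o++] = (char)(0xC0 | (c >> 6));
			out[o++] = (char)(0x80 | (c & 0x3F));
		} else {
			out[o++] = (char)(0xE0 | (c >> 12));
			out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
			out[o++] = (char)(0x80 | (c & 0x3F));
		}
	}
	out[o] = '\0';
	return o;
}
```

The result is an ordinary char * string, so it fits the existing
name APIs without any change on the userland side.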
> Or would you suggest to not encode the Unicode names in the two latter
> cases into utf8 but something else? What intermediate representation are
> you talking about (I only see "must fit char* apis" as an intermediate
> representation - but I don't see how to avoid that).
The idea is that instead of having the kernel present the names
to userland in UTF-8, it would present them in some other, 8-bit only
encoding, such as KOI8-R or ISO-8859-*. New file names (created files)
would be assumed to be in that 8-bit encoding and transcoded to Unicode
and then to the on-disk format as appropriate.
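A rough sketch of what such a transcoding layer might look like, in C.
Everything here is an assumption for illustration: the struct
enc8_table and the enc8_* / ucs2_to_enc8 names are hypothetical, and
the only table filled in is ISO-8859-1, since its upper half maps
directly to U+0080..U+00FF. KOI8-R or another ISO-8859-* set would
need its own table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table describing one 8-bit presentation encoding. */
struct enc8_table {
	uint16_t high[128];	/* Unicode code points for bytes 0x80..0xFF */
};

/* ISO-8859-1: bytes 0x80..0xFF map directly to U+0080..U+00FF. */
void
enc8_init_latin1(struct enc8_table *t)
{
	for (int i = 0; i < 128; i++)
		t->high[i] = (uint16_t)(0x80 + i);
}

/* Byte coming from userland -> UCS-2 code unit for the on-disk name. */
uint16_t
enc8_to_ucs2(const struct enc8_table *t, uint8_t b)
{
	return (b < 0x80) ? (uint16_t)b : t->high[b - 0x80];
}

/*
 * UCS-2 code unit read from disk -> byte presented to userland.
 * Characters with no equivalent in the 8-bit encoding become '?'.
 */
uint8_t
ucs2_to_enc8(const struct enc8_table *t, uint16_t c)
{
	if (c < 0x80)
		return (uint8_t)c;
	for (int i = 0; i < 128; i++)
		if (t->high[i] == c)
			return (uint8_t)(0x80 + i);
	return (uint8_t)'?';
}
```

Note that the reverse direction is lossy: any on-disk character outside
the chosen 8-bit set collapses to the same placeholder, so two distinct
on-disk names could end up looking identical to userland.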
The default would be to present the names in UTF-8 of course, but
a mount option to specify a different 'presentation' file name encoding
would be very convenient for personal use for file systems which
are often shared between UNIX and Windows, such as cd9660 and
msdosfs. AFAICS transcoding support is not very useful for FFS,
where one can use whatever they want and there is no need for it.
As the first step, do we all agree that Unicode names should
be presented in UTF-8 form (as done by NTFS), instead of mangling
them as done by cd9660 and msdosfs?
Jaromir Dolecek <jdolecek@NetBSD.org> http://www.NetBSD.cz/
-=- We should be mindful of the potential goal, but as the Buddhist -=-
-=- masters say, ``You may notice during meditation that you -=-
-=- sometimes levitate or glow. Do not let this distract you.'' -=-