Subject: Re: Unicode support in iso9660.
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Jaromir Dolecek <jdolecek@NetBSD.org>
List: tech-kern
Date: 11/21/2004 16:30:03
der Mouse wrote:
> Ah, but UTF-8 does change that: it means that lots of octet sequences
> that were perfectly good under the previous paradigm (eg, 0xaa 0xaa
> 0xaa 0xaa 0xaa 0xaa 0xcc 0x42 - ªªªªªªÌB if you're using 8859-1) are
> now invalid.

I mean: if a file name is encoded in UTF-8, an encoding-agnostic app
can handle it the same way it handles any non-UTF-8 file name. '/'
is still a slash, and there is still just a single 0x00 at the end,
i.e. UTF-8 file names are fully compatible with the way UNIX systems
work.

The compatibility problem you describe is only present if the
application interprets the file name it gets.
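
To make the point above concrete, here is a minimal sketch in Python
(not from the original thread; the helper names split_path,
utf8_is_path_safe and is_valid_utf8 are made up for illustration).
It treats paths as opaque bytes, checks that UTF-8 never produces a
'/' or NUL byte for a non-ASCII character, and shows that the quoted
octet sequence only becomes a problem once something tries to decode
it as UTF-8.

```python
# Byte sequence from the quoted message: fine as ISO 8859-1, invalid as UTF-8.
LEGACY_NAME = bytes([0xAA] * 6 + [0xCC, 0x42])


def split_path(path: bytes) -> list:
    """Split on the '/' byte only; the components are never decoded."""
    return [part for part in path.split(b"/") if part]


def utf8_is_path_safe(text: str) -> bool:
    """True if no non-ASCII character encodes to any ASCII byte.

    Every byte of a multi-byte UTF-8 sequence is >= 0x80, so neither
    0x2F ('/') nor 0x00 can appear inside one.
    """
    for ch in text:
        if ord(ch) >= 0x80 and any(b < 0x80 for b in ch.encode("utf-8")):
            return False
    return True


def is_valid_utf8(raw: bytes) -> bool:
    """Only an application that interprets the name needs this check."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # An encoding-agnostic tool handles both names identically.
    print(split_path("/tmp/kůň.txt".encode("utf-8")))
    print(split_path(b"/tmp/" + LEGACY_NAME))
    # The difference shows up only when the bytes are interpreted.
    print(is_valid_utf8(LEGACY_NAME))
```

split_path() works the same for both names; only is_valid_utf8(),
which stands in for an application that interprets the name, tells
them apart.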

> Does POSIX say anything about whether the octet sequences seen by the
> application as pathname components may be implicitly paired with
> information such as the current locale when determining what other
> filenames they may match?  That is, if one program creates a file whose
> name consists of one "character", 0xaa, must another program opening
> the same name get the same file, or may it get a different file
> depending on other state (such as what locales the programs were/are
> running under)?

Yeah, it might be quite a problem that a single file would appear
under different names in different locales, or that there could
appear to be duplicate directory entries if the current locale cannot
express some of the characters of the file name.
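
Both effects can be sketched in a few lines of Python. This is an
illustration only, assuming a hypothetical transcoding layer
(to_locale) that substitutes '?' for characters the target charset
cannot express; no real system is claimed to behave exactly like this.

```python
def to_locale(name: str, charset: str) -> bytes:
    """Hypothetical transcoding step: unmappable characters become '?'."""
    return name.encode(charset, errors="replace")


# 1. The same stored octet reads as a different character per locale.
raw = bytes([0xAA])
as_latin1 = raw.decode("iso-8859-1")
as_latin2 = raw.decode("iso-8859-2")

# 2. Two distinct names become indistinguishable in a locale that
#    cannot express the characters that differ between them.
names = ["ž.txt", "ř.txt"]
seen_in_latin1 = [to_locale(n, "iso-8859-1") for n in names]

if __name__ == "__main__":
    print(as_latin1 != as_latin2)
    print(seen_in_latin1)
```

With a lossy mapping like this, an 8859-1 user would see what looks
like the same directory entry twice, and opening it by the displayed
name could not select either file reliably.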

So IMHO it's not feasible to do fully transparent and automatic
transcoding. Apps need to be aware of the original undecoded (UTF-8)
file name, use that for file system operations, store it in config
files, etc.
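
One way an application could follow that rule is sketched below in
Python. The FileName class is hypothetical, invented for this
example: it keeps the undecoded bytes as the only identity of the
file and derives a display string that is never fed back into file
system calls.

```python
class FileName:
    """Keeps the original undecoded name; decoding is for display only."""

    def __init__(self, raw: bytes):
        # Use this for open(), rename(), config files, and so on.
        self.raw = raw

    def display(self) -> str:
        # Lossy on purpose: invalid sequences become U+FFFD.
        return self.raw.decode("utf-8", errors="replace")

    def round_trips(self) -> bool:
        # True only if the display form could reproduce the raw name.
        return self.display().encode("utf-8") == self.raw


if __name__ == "__main__":
    good = FileName("kůň.txt".encode("utf-8"))
    bad = FileName(bytes([0xAA, 0x42]))
    print(good.display(), good.round_trips())
    print(bad.raw, bad.round_trips())
```

The second name does not survive a round trip through its display
form, which is why the raw bytes, not the decoded string, have to be
what gets stored and passed back to the file system.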

How does Solaris handle this? Do they have any kind of transparent
transcoding layer similar to what Jason and others proposed? Or do
they depend on applications handling it themselves?

Jaromir
-- 
Jaromir Dolecek <jdolecek@NetBSD.org>            http://www.NetBSD.cz/
-=- We should be mindful of the potential goal, but as the Buddhist -=-
-=- masters say, ``You may notice during meditation that you        -=-
-=- sometimes levitate or glow.   Do not let this distract you.''   -=-