Subject: Re: Unicode support in iso9660.
To: Valeriy E. Ushakov <uwe@ptc.spbu.ru>
From: Jaromir Dolecek <jdolecek@NetBSD.org>
List: tech-kern
Date: 11/15/2004 18:36:15
Valeriy E. Ushakov wrote:
> It's obviously a complex matter, but in the interim *some* stop gap
> solution is necessary.  99% of non-ascii cds 99% of our users have to
> deal with are in a single encoding.  I'd be quite happy with a fixed

A good stop-gap measure would be to present the (Joliet) Unicode names
to userland as UTF-8 by default (as NTFS does), instead of using
the current poor filter to ASCII.  GTK2 apps decode UTF-8 filenames
and show them properly (just checked with Gimp 2.0.6 and Czech
file names on NTFS, works perfectly).  Qt does not, however.
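
To make the idea concrete, here is a minimal sketch of the conversion
itself, assuming the Joliet name is plain big-endian UCS-2.  The function
name joliet_to_utf8() and its interface are made up for illustration, this
is not the existing cd9660 code, and mapping stray surrogates to U+FFFD is
my own assumption.

```c
#include <stddef.h>

/*
 * Convert a Joliet name (big-endian UCS-2, inlen bytes) to a
 * NUL-terminated UTF-8 string.  Returns the number of bytes written
 * (not counting the NUL), or -1 if inlen is odd or out is too small.
 * Hypothetical helper for illustration only.
 */
int
joliet_to_utf8(const unsigned char *in, size_t inlen,
    char *out, size_t outsize)
{
	size_t i, o = 0;

	if (inlen % 2 != 0 || outsize == 0)
		return -1;

	for (i = 0; i < inlen; i += 2) {
		unsigned int c = ((unsigned int)in[i] << 8) | in[i + 1];

		/* UCS-2 has no surrogates; map stray ones to U+FFFD. */
		if (c >= 0xD800 && c <= 0xDFFF)
			c = 0xFFFD;

		if (c < 0x80) {
			/* 1 byte: 0xxxxxxx */
			if (o + 1 >= outsize)
				return -1;
			out[o++] = (char)c;
		} else if (c < 0x800) {
			/* 2 bytes: 110xxxxx 10xxxxxx */
			if (o + 2 >= outsize)
				return -1;
			out[o++] = (char)(0xC0 | (c >> 6));
			out[o++] = (char)(0x80 | (c & 0x3F));
		} else {
			/* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
			if (o + 3 >= outsize)
				return -1;
			out[o++] = (char)(0xE0 | (c >> 12));
			out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
			out[o++] = (char)(0x80 | (c & 0x3F));
		}
	}
	out[o] = '\0';
	return (int)o;
}
```

A UCS-2 character needs at most three UTF-8 bytes, so an output buffer of
3 * (inlen / 2) + 1 bytes is always large enough.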

Per-mount file name translation to a pure 8-bit encoding would still be
useful, of course.  It could be implemented fairly easily even without
any iconv-like functionality in the kernel: mount_* can build the
translation table in userland using iconv(3) and pass it as one
of the mount arguments.  I think there is some code in the smbfs mount
for this, though it's disabled on NetBSD at the moment.
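
A rough sketch of the userland side, under my own assumptions: the table
maps each byte of the 8-bit charset to its UCS-2 code point, and the
kernel would search it in reverse when translating a Joliet name.  The
names build_xlat_table() and XLAT_UNMAPPED are hypothetical, and how the
table would actually be handed over in the mount arguments is not shown.
The iconv(3) prototype used here is the POSIX one (char ** input buffer);
some systems declare it with const char ** instead.

```c
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>

#define XLAT_UNMAPPED 0xFFFFu	/* marker: byte has no Unicode mapping */

/*
 * Fill table[b] with the UCS-2 code point for byte b of the given
 * 8-bit charset.  Returns 0 on success, -1 if iconv_open(3) does not
 * know the charset.  Hypothetical helper for illustration only.
 */
int
build_xlat_table(const char *charset, uint16_t table[256])
{
	iconv_t cd = iconv_open("UCS-2BE", charset);
	int b;

	if (cd == (iconv_t)-1)
		return -1;

	for (b = 0; b < 256; b++) {
		char in = (char)b;
		unsigned char out[4];
		char *inp = &in;
		char *outp = (char *)out;
		size_t inleft = 1, outleft = sizeof(out);

		/* Reset the conversion state before each byte. */
		(void)iconv(cd, NULL, NULL, NULL, NULL);

		/* Accept only conversions yielding one UCS-2 unit. */
		if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1 ||
		    sizeof(out) - outleft != 2)
			table[b] = XLAT_UNMAPPED;
		else
			table[b] = (uint16_t)((out[0] << 8) | out[1]);
	}

	iconv_close(cd);
	return 0;
}
```

With a table like this, mount_* would call something like
build_xlat_table("KOI8-R", table) once at mount time, and the kernel
would need only a 512-byte array and a lookup loop, no charset knowledge.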

> In general, lack of LC_COLLATE and iso9660 file name translation
> literally *kills* NetBSD acceptance here:

Full LC_COLLATE would be really nice.
 
> - home use: What, I can't mount my mp3 cd with Russian names?  Good bye.
> 
> - server use: What, I can't get the names of my clients in the
>   database sorted alphabetically?  Are you f*ing kidding?

Most databases do not depend on system collation support for this.
At least MySQL and PostgreSQL do not.
 
> PS: And just don't get me started on wscons vs 8-bit encodings.

Yeah :/

Jaromir 
-- 
Jaromir Dolecek <jdolecek@NetBSD.org>            http://www.NetBSD.cz/
-=- We should be mindful of the potential goal, but as the Buddhist -=-
-=- masters say, ``You may notice during meditation that you        -=-
-=- sometimes levitate or glow.   Do not let this distract you.''   -=-