Subject: Re: Unicode support in iso9660.
To: Jason Thorpe <thorpej@shagadelic.org>
From: SODA Noriyuki <soda@sra.co.jp>
List: tech-kern
Date: 11/20/2004 04:57:01
>>>>> On Fri, 19 Nov 2004 07:48:49 -0800,
	Jason Thorpe <thorpej@shagadelic.org> said:

> Perhaps for existing installations that are broken (as Soda says :-), a 
> mount option to override the encoding could be used... but I think just 
> for simplicity (and, thus, sanity) you have to pick something and 
> standardize on it.

An iconv-like interface isn't that complex, either.
And an iconv-like interface can solve problems that a single
standardized codeset cannot. So an iconv-like interface is the way
to go.

Also, allowing multiple codesets in a single filesystem is not only
needed for compatibility, but also *useful* in this broken world.

Assume you have two tar files: one was created with ISO-8859-1 as
its pathname codeset, and the other with EUC-JP. Also assume you
want UTF-8 as your filesystem codeset.

Under a filesystem which doesn't care about the pathname codeset,
like the current ffs, you can just untar the tar files and then
convert the pathnames to UTF-8.
But under a filesystem which only accepts UTF-8 pathnames, you may
not even be able to untar those tar files, because both of them may
contain pathnames which are not valid UTF-8.
This means tar must have a feature which converts its pathname
codeset. And not only tar: every archive program would need such a
codeset conversion feature. That is practically impossible.
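
To make the tar example concrete, here is a minimal Python sketch.
It is not taken from tar or from any filesystem code; the helper
names is_valid_utf8, to_utf8_name and convert_tree are made up for
illustration. It checks whether a raw pathname is valid UTF-8, and
shows the "untar first, convert later" approach that a
codeset-agnostic filesystem allows. It assumes the source codeset of
each extracted tree is already known, since that cannot be detected
reliably from the bytes alone.

```python
import os


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the raw pathname bytes decode as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_utf8_name(raw: bytes, src_codeset: str) -> bytes:
    """Re-encode one pathname component from src_codeset to UTF-8."""
    return raw.decode(src_codeset).encode("utf-8")


def convert_tree(top: bytes, src_codeset: str) -> None:
    """Rename every entry below top (not top itself) to UTF-8.

    Assumes the filesystem stores pathnames as opaque bytes, and that
    every name below top really is encoded in src_codeset.
    """
    # Walk bottom-up so children are renamed before their parent
    # directory, which keeps the paths we join below valid.
    for root, dirs, files in os.walk(top, topdown=False):
        for name in dirs + files:
            new = to_utf8_name(name, src_codeset)
            if new != name:
                os.rename(os.path.join(root, name),
                          os.path.join(root, new))


# "cafe" with e-acute, as an ISO-8859-1 archive would store it.
latin1_name = "caf\u00e9.txt".encode("iso-8859-1")
# Three kanji ("nihongo"), as an EUC-JP archive would store them.
eucjp_name = "\u65e5\u672c\u8a9e.txt".encode("euc_jp")

print(is_valid_utf8(latin1_name), is_valid_utf8(eucjp_name))
print(is_valid_utf8(to_utf8_name(latin1_name, "iso-8859-1")))
```

Both sample names fail the UTF-8 check as stored, so a filesystem
that only accepts UTF-8 would have to reject them at extraction
time. After to_utf8_name() they pass.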

Filesystems which don't care about the pathname codeset are sometimes
rather useful in this broken world, in cases like the one above.
So it's nice if we can choose a filesystem which doesn't care about
its pathname codeset. But it's also nice if one can force a
standardized codeset for all filesystems.
An iconv-like interface can satisfy both requirements. That's the
reason why I think an iconv-like interface is the way to go.
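
As a rough illustration of how one interface could cover both cases,
here is a userland Python sketch. It is not a proposal for the actual
kernel API: the class PathnameConverter and its methods are
hypothetical, and Python's codec machinery stands in for iconv. The
assumption is that each mount carries an on-disk codeset and a
user-side codeset, and that leaving either one unset means "don't
care, pass the bytes through".

```python
from typing import Optional


class PathnameConverter:
    """Hypothetical per-mount pathname converter (illustration only)."""

    def __init__(self, disk_codeset: Optional[str],
                 user_codeset: Optional[str]) -> None:
        self.disk_codeset = disk_codeset
        self.user_codeset = user_codeset

    @staticmethod
    def _convert(name: bytes, src: Optional[str],
                 dst: Optional[str]) -> bytes:
        # No codeset configured: treat the pathname as opaque bytes.
        if src is None or dst is None:
            return name
        # Otherwise convert, which also rejects names that are not
        # valid in the source codeset (raises UnicodeDecodeError).
        return name.decode(src).encode(dst)

    def to_disk(self, name: bytes) -> bytes:
        """Convert a name coming from userland to its on-disk form."""
        return self._convert(name, self.user_codeset, self.disk_codeset)

    def from_disk(self, name: bytes) -> bytes:
        """Convert an on-disk name to the form userland expects."""
        return self._convert(name, self.disk_codeset, self.user_codeset)


# Mount 1: codeset-agnostic, any byte string is accepted unchanged.
raw_mount = PathnameConverter(None, None)
print(raw_mount.to_disk(b"caf\xe9.txt") == b"caf\xe9.txt")

# Mount 2: UTF-8 on disk, while userland speaks EUC-JP.
conv_mount = PathnameConverter("utf-8", "euc_jp")
euc_name = "\u65e5\u672c\u8a9e".encode("euc_jp")
print(conv_mount.from_disk(conv_mount.to_disk(euc_name)) == euc_name)
```

With both codesets set to the same value, the same path still
validates every name, which corresponds to forcing one standardized
codeset; with neither set, it behaves like a filesystem that doesn't
care.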
--
soda