Subject: Re: Unicode support in kernel
To: None <dolecek@ics.muni.cz>
From: Noriyuki Soda <soda@sra.co.jp>
List: tech-kern
Date: 10/15/1999 18:13:43
> > Yes. I'd like to avoid it, but...
> >
> > At least, the long filename extension to the MS-DOS FAT filesystem
> > requires Unicode support in the kernel, since it encodes each filename
> > in both UCS-2 and a codepage-dependent codeset. Thus, for example, if
> > userland specifies a filename in Shift_JIS, the kernel has to translate
> > it to Unicode, and the reverse is also true.
>
> Really? I didn't know that. So even Windows 95 stores the long
> filenames on FAT fs in Unicode too?
Yes.
> But there is no mention of Unicode in our sys/msdosfs/ AFAICS :(
Please look at sys/msdosfs/msdosfs_conv.c:unix2winfn().
It stores 2 bytes per character (i.e. UCS-2), although it handles
only ISO-8859-1.
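To illustrate why handling only ISO-8859-1 is the easy case, here is a minimal sketch. It is not the actual unix2winfn() code; the helper name latin1_to_ucs2le() and its interface are made up for this example. ISO-8859-1 code points coincide with U+0000..U+00FF, so each byte can simply be widened to a 2-byte little-endian unit with a zero high byte. Any other codeset (Shift_JIS, for instance) would need a real translation table instead.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not taken from msdosfs_conv.c).
 * Widen ISO-8859-1 bytes into UCS-2 little-endian code units.
 * Returns 0 on success, or -1 if dst cannot hold 2 * srclen bytes.
 */
static int
latin1_to_ucs2le(const uint8_t *src, size_t srclen, uint8_t *dst, size_t dstlen)
{
	size_t i;

	if (dstlen / 2 < srclen)
		return -1;
	for (i = 0; i < srclen; i++) {
		dst[2 * i] = src[i];	/* low byte: the Latin-1 code point */
		dst[2 * i + 1] = 0;	/* high byte: always 0 for U+0000..U+00FF */
	}
	return 0;
}
```

For example, the three bytes 'a', 0xE9, 'z' become the six bytes 61 00 E9 00 7A 00.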
> The other thing I thought of as a cool feature to have is
> a way to have the filenames translated from an arbitrary
> codeset to the codeset of the system (or process). You would
> just specify the native codeset of the mounted volume
> at mount time and the filenames would be translated
> transparently. *daydream* ;-)
I agree; see my previous message :-)
> So my conclusion:
> The approach with just a mount option telling the codeset the
> Unicode filenames should be recoded into has its shortcomings,
> but it is much better than what we have now - or, better said, what
> we don't have now ;-) It may be somewhat crude, but it seems sufficient
> to be used until a "proper" solution is thought out and implemented.
> If no one complains loudly soon, I'll finish what I have
> now and commit it into the tree.
As mentioned in my previous message, NTFS should not have a mount option
for the codeset.
As a short-term solution, a system-wide option for the user's preferred
codeset, plus a per-mount option for the filesystem's native codeset for
msdosfs and cd9660fs (but not NTFS), is the way to go, IMHO.
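A rough sketch of how the two options could interact is below. Everything in it is an assumption made for illustration: the enum, the variable system_codeset, and both function names are hypothetical and do not exist in the tree. It also assumes that NTFS names are always Unicode on disk, which is why a mount option is ignored there, and that a missing mount option means the volume uses the system-wide codeset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical filesystem kinds, for illustration only. */
enum fs_kind { FS_MSDOSFS, FS_CD9660, FS_NTFS };

/* Hypothetical system-wide setting: the codeset userland prefers. */
static const char *system_codeset = "ISO-8859-1";

/*
 * Pick the codeset of names as stored on the volume.
 * NTFS is assumed to be Unicode on disk, so any mount option is ignored.
 * For msdosfs and cd9660 the per-mount option wins; when it is absent
 * (NULL), this sketch falls back to the system-wide codeset.
 */
static const char *
volume_codeset(enum fs_kind kind, const char *mount_opt)
{
	if (kind == FS_NTFS)
		return "UCS-2";
	return mount_opt != NULL ? mount_opt : system_codeset;
}

/* Translation is needed only when the two codesets differ. */
static int
needs_translation(enum fs_kind kind, const char *mount_opt)
{
	return strcmp(volume_codeset(kind, mount_opt), system_codeset) != 0;
}
```

With these definitions, an msdosfs volume mounted with a Shift_JIS option on an ISO-8859-1 system needs translation, a cd9660 volume with no option does not, and NTFS always translates from Unicode regardless of any option.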
--
soda