tech-kern: Re: Unicode support in kernel

Subject: Re: Unicode support in kernel
To: dolecek@ics.muni.cz, Noriyuki Soda <soda@sra.co.jp>
From: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
List: tech-kern
Date: 10/15/1999 11:10:12

On Thu, Oct 14, 1999 at 08:26:57PM +0300, Jaromir Dolecek wrote:
> Noriyuki Soda wrote:
> > And doesn't handle multiuser case.
> 
> Right.
> 
> > It can be done in library level, or perhaps per process codeset
> > attribute in kernel. Thus don't have to change all userland.
> > (BTW, latter is.... mmmmm ;-))
> 
> I thought about it a bit more and doing this per-process attribute
> in kernel would not be actually very hard (at least it doesn't seem
> to be :). Internally, the filenames would be kept in utf-8 and
> on every pass from/to kernel (open(), creat(), getdents() etc.)
> the filename would be recoded to/from the process's preferred vfs
> charset. The recoding might be even done on library level - if
> the preferred encoding would be in environment, that
> would mean just one more system call (to find out if the
> recoding is necessary for this particular filename). Ha, problem -
> what if ntfs volume would be mounted on some ffs directory ? In
> that case, part of the path would need to be recoded and part not.
> So the recoding would have to happen on namei() level :(

You are mixing two different en/decodings here, aren't you?

On the user/kernel interface, the open() library routine calls a library
routine to do the user's-preferred-charset-to-utf8 encoding.

The _kernel_ needs to do utf8-to-native-filesystem-encoding at the
namei level, more or less, but that's a second and different encoding
step.
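
The two steps above can be illustrated with a small userland sketch. It is
not kernel code and not anything NetBSD implements; the function names, the
mount table, and the choice of ISO 8859-2 as the user's charset are all
assumptions made for illustration. The only outside fact relied on is that
NTFS stores names as UTF-16. The point is that step 1 knows nothing about
mount points, while step 2 picks the encoding per path component, which is
why it belongs near namei().

```python
# Illustrative sketch only: all names here are hypothetical.

def user_to_utf8(name: bytes, user_charset: str) -> bytes:
    """Step 1 (library level): user's preferred charset -> UTF-8."""
    return name.decode(user_charset).encode("utf-8")


def utf8_to_native(path: bytes, mounts: dict, root_encoding: str) -> list:
    """Step 2 (kernel, roughly namei level): UTF-8 -> native encoding.

    `mounts` maps a tuple of path components (a mount point) to the
    native name encoding of the filesystem mounted there. Each component
    is recoded for the filesystem that holds it, so a single path may be
    recoded in several different ways.
    """
    encoding = root_encoding
    walked = ()
    native = []
    for comp in path.split(b"/"):
        if not comp:
            continue  # skip the empty pieces produced by '/' separators
        text = comp.decode("utf-8")
        native.append(text.encode(encoding))
        walked += (text,)
        # Crossing a mount point changes the encoding for what follows.
        encoding = mounts.get(walked, encoding)
    return native


# Assumed setup: root is an ffs volume whose names are kept as UTF-8,
# with an ntfs volume (UTF-16 names) mounted on /mnt/nt.
mounts = {("mnt", "nt"): "utf-16-le"}

# The user types a name in ISO 8859-2; the library recodes it once.
utf8_path = b"/mnt/nt/" + user_to_utf8(b"\xbe", "iso8859-2")

# The kernel side then recodes per component while walking the path.
components = utf8_to_native(utf8_path, mounts, "utf-8")
```

The mount point name itself ("nt") is looked up in the ffs directory and so
stays in the ffs encoding; only the components below it are recoded for
ntfs.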

(always assuming utf-8 is the universal charset we would like it to be!)

Regards,
	Ignatios
-- 
 * Progress (n.): The process through which Usenet has evolved from
   smart people in front of dumb terminals to dumb people in front of
   smart terminals.  -- obs@burnout.demon.co.uk (obscurity)