Subject: Unicode support in kernel
To: None <>
From: Jaromir Dolecek <>
List: tech-kern
Date: 10/14/1999 12:08:10
Since ntfs pretty much needs some Unicode support in the kernel, I'm
going to integrate code heavily hacked from Motomichi Matsuzaki's
patches for FreeBSD Joliet Unicode support (his original patches
are available on The code
will be shared by both cd9660 & ntfs and be available for any other
filesystem to use.

The Unicode support will be pulled in when either ntfs or cd9660
is included in the kernel, or when options UNICODE is present
in the kernel config. The latter is necessary so that
cd9660 and ntfs can be loaded as LKMs.

It's possible to specify the character set into which the Unicode
filenames will be translated. utf-8 is always available, and
I'll probably add iso-8859-1. The other currently supported
encodings are iso-8859-2, koi8-r and euc-jp.
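
To make the idea of selectable target encodings concrete, here is a
minimal userland sketch of a name-indexed table of encoders. It is
not taken from the actual patches: every identifier (uni_encode_fn,
uni_charset, uni_charset_lookup, uni_recode) is made up for
illustration, each 16-bit unit is treated as a plain BMP code point
(no surrogate pairs), and only utf-8 and iso-8859-1 are filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical encoder signature: encode one 16-bit Unicode unit into
 * out[], return the number of bytes written, or 0 if it does not fit.
 */
typedef size_t (*uni_encode_fn)(uint16_t ch, char *out, size_t outlen);

/* Standard UTF-8 encoding of a BMP code point (1 to 3 bytes). */
size_t
uni_to_utf8(uint16_t ch, char *out, size_t outlen)
{
	if (ch < 0x80) {
		if (outlen < 1)
			return 0;
		out[0] = (char)ch;
		return 1;
	}
	if (ch < 0x800) {
		if (outlen < 2)
			return 0;
		out[0] = (char)(0xc0 | (ch >> 6));
		out[1] = (char)(0x80 | (ch & 0x3f));
		return 2;
	}
	if (outlen < 3)
		return 0;
	out[0] = (char)(0xe0 | (ch >> 12));
	out[1] = (char)(0x80 | ((ch >> 6) & 0x3f));
	out[2] = (char)(0x80 | (ch & 0x3f));
	return 3;
}

/* iso-8859-1 is the first 256 code points; anything else becomes '?'. */
size_t
uni_to_latin1(uint16_t ch, char *out, size_t outlen)
{
	if (outlen < 1)
		return 0;
	out[0] = (ch < 0x100) ? (char)ch : '?';
	return 1;
}

struct uni_charset {
	const char	*name;
	uni_encode_fn	 encode;
};

/* Assumed table layout; further encodings would be added here. */
static const struct uni_charset uni_charsets[] = {
	{ "utf-8",	uni_to_utf8 },
	{ "iso-8859-1",	uni_to_latin1 },
};

/* Find an encoding by name; returns NULL if it is not in the table. */
const struct uni_charset *
uni_charset_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(uni_charsets) / sizeof(uni_charsets[0]); i++)
		if (strcmp(uni_charsets[i].name, name) == 0)
			return &uni_charsets[i];
	return NULL;
}

/*
 * Recode srclen units into dst, always NUL-terminating. Stops early
 * if dst fills up. Returns the number of bytes written, not counting
 * the terminating NUL.
 */
size_t
uni_recode(const struct uni_charset *cs, const uint16_t *src, size_t srclen,
    char *dst, size_t dstlen)
{
	size_t i, n, used = 0;

	assert(cs != NULL && dst != NULL);
	if (dstlen == 0)
		return 0;
	for (i = 0; i < srclen; i++) {
		n = cs->encode(src[i], dst + used, dstlen - 1 - used);
		if (n == 0)
			break;
		used += n;
	}
	dst[used] = '\0';
	return used;
}
```

A filesystem would look the encoding up once with
uni_charset_lookup() and then call uni_recode() for each filename
it reads from disk.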

Currently, the code uses sysctl to set the preferred target
encoding. Since this is not very flexible (and would
mean that the filename cache would have to be flushed for
all filesystems using the recoding engine on any change
of the character set), I'm going to make the target
character set a mount option.
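
As a rough illustration of the mount-option approach, the sketch
below resolves a charset name once at mount time and stores the
result per mount, so that one mount's choice never affects another.
This is only a userland sketch under my own assumptions: the
structures (uni_mount_args, uni_mount), the function uni_mount_setup,
the UNI_CSNAMELEN limit and the utf-8 default for an empty name are
all hypothetical and do not describe the real code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define UNI_CSNAMELEN	32	/* hypothetical limit on charset name length */

/* Hypothetical: the part of the mount arguments carrying the charset. */
struct uni_mount_args {
	char charset[UNI_CSNAMELEN];	/* empty string means "use default" */
};

/* Hypothetical per-mount state kept by the filesystem. */
struct uni_mount {
	const char *charset;		/* points into uni_known[] */
};

/* Encodings named in the post as currently supported. */
static const char *const uni_known[] = {
	"utf-8", "iso-8859-2", "koi8-r", "euc-jp",
};

/*
 * Resolve the requested charset for one mount. Returns 0 on success
 * or EINVAL for an unterminated or unknown name; on failure *um is
 * left untouched.
 */
int
uni_mount_setup(struct uni_mount *um, const struct uni_mount_args *args)
{
	const char *want;
	size_t i;

	assert(um != NULL && args != NULL);

	/* Never trust that a user-supplied buffer is NUL-terminated. */
	if (memchr(args->charset, '\0', sizeof(args->charset)) == NULL)
		return EINVAL;

	want = (args->charset[0] != '\0') ? args->charset : "utf-8";
	for (i = 0; i < sizeof(uni_known) / sizeof(uni_known[0]); i++) {
		if (strcmp(uni_known[i], want) == 0) {
			um->charset = uni_known[i];
			return 0;
		}
	}
	return EINVAL;
}
```

Because the encoding lives in the per-mount structure, changing it
means remounting that one filesystem, and only its cached names are
affected.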

The thing I'm not sure about is where the files should
go. There is one unicode_subr.c file, two headers (unicode.h
and unicode_subr.h) and about 6 .c files implementing
the recoding for the individual charsets/encodings. The
proposals I got so far were sys/lib/libunicode (or
sys/lib/unicode) and sys/miscfs/genfs/. I'd probably go for
the latter, but somehow I don't like it much - it's too deep
in the directory structure and the Unicode recoding engine
is not useful only to filesystems. I'd rather
put the general interface .c file in sys/kern/kern_unicode.c,
with the headers going into sys/sys/ and the other .c
files into, say, kern/unicode/ or something like that.

Any ideas, thoughts, comments?

Jaromir Dolecek <>
"The only way to get rid of a temptation is to yield to it." -- Oscar Wilde