Subject: Re: fs transcoding, was Re: Unicode support in iso9660.
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 11/25/2004 17:22:41
[ On Monday, November 22, 2004 at 22:28:50 (-0500), der Mouse wrote: ]
> Subject: Re: fs transcoding, was Re: Unicode support in iso9660.
>
> > I just checked SUSv3.  It says nothing particularly useful.
> > 
> > "For a filename to be portable across implementations conforming to
> >  IEEE Std 1003.1-2001, it shall consist only of the portable filename
> >  character set as defined in Portable Filename Character Set.
> 
> That's very interesting information.  But it makes me ask, does the SUS
> specify any particular encoding scheme for converting those characters
> into addressing units, or is the encoding left unspecified?

Well, character encoding has always been explicitly left unspecified in
POSIX and its derivatives in order, IIRC and IIUC, to accommodate the
likes of EBCDIC and other similar non-ASCII based systems.

POSIX is, after all, an API specification, not a data format or data
interchange specification.

There are relevant data interchange specifications though, including
specifically ISO-9660 (aka ECMA-119) itself.  :-)

According to ECMA-119 the "characters in the descriptors shall be coded
according to ECMA-6" (which is of course ASCII).  Note though that
further restrictions on the usable characters result in only a tiny
subset of ASCII being valid for filenames on a true ISO-9660 compliant
filesystem (e.g. no lower-case chars).
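
To make the "tiny subset" concrete, here is a minimal sketch in C of a
check against the ECMA-119 d-characters (the upper-case letters, the
digits, and underscore).  The names is_d_character() and is_d_string()
are hypothetical, made up for illustration, and not taken from any
existing kernel or library code.  The sketch only tests the character
repertoire; it does not check the rest of the file identifier rules
(the "." and ";" separators, the version number, or the length limits).

```c
#include <stdbool.h>
#include <string.h>

/*
 * The d-characters of ECMA-119: 26 upper-case letters, 10 digits and
 * underscore.  Listing them literally, instead of testing ranges such as
 * 'A'..'Z', keeps the check independent of the execution character set
 * (the letters are not contiguous in EBCDIC, for example).
 */
static const char d_characters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";

/* Hypothetical helper: true if c is one of the d-characters. */
bool
is_d_character(char c)
{
	/* strchr() would also match the terminating NUL, so exclude it. */
	return c != '\0' && strchr(d_characters, c) != NULL;
}

/*
 * Hypothetical helper: true if s is non-empty and consists only of
 * d-characters.  This is a repertoire check only, not a full file
 * identifier validator.
 */
bool
is_d_string(const char *s)
{
	if (*s == '\0')
		return false;
	for (; *s != '\0'; s++) {
		if (!is_d_character(*s))
			return false;
	}
	return true;
}
```

With this, is_d_string("README") is true while is_d_string("readme") is
false, which is the lower-case restriction mentioned above.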

-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>