Re: A draft for a multibyte and multi-codepoint C string interface

To: tech-userlevel%NetBSD.org@localhost
Subject: Re: A draft for a multibyte and multi-codepoint C string interface
From: Mouse <mouse%Rodents-Montreal.ORG@localhost>
Date: Sun, 7 Apr 2013 15:55:32 -0400 (EDT)

>> It costs quite a lot, actually, because it would mean that
>> everything working with pathnames as other than opaque octet strings
>> has to be aware of its idiosyncrasies, such as the normalization
>> rules, as I mentioned above.
> No.  Did you read Rob Pike and Ken Thompson's paper about UTF-8?  They
> decided to not take into account the way the characters may be
> rendered or collating sequences and so on.

That's fine for them, but that's not the position this thread started
out being about and it's not the position I've been arguing against.
The position I've been arguing against is the one that wants UTF-8
awareness, normalization in particular, in the kernel, or at least on
the called, not caller, side of open(2) and related calls.

If Pike and Thompson think the syscall interface should be opaque octet
strings, with UTF-8 awareness limited to userland, I agree with them.
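
To make the distinction concrete, here is a minimal sketch, not anything
proposed in this thread.  It assumes the called side compares names
octet for octet, and it uses Python's standard unicodedata module to
stand in for a userland normalization policy.  The helper names
same_octets and same_after_nfc are made up for the illustration.

```python
import unicodedata

# Two spellings of the same visible name: precomposed and decomposed "e acute".
nfc_name = "caf\u00e9".encode("utf-8")    # b'caf\xc3\xa9'
nfd_name = "cafe\u0301".encode("utf-8")   # b'cafe\xcc\x81'


def same_octets(a: bytes, b: bytes) -> bool:
    """Assumed called-side view: names match only if the octets are identical."""
    return a == b


def same_after_nfc(a: bytes, b: bytes) -> bool:
    """Hypothetical caller-side policy: decode as UTF-8, normalize to NFC, compare."""
    left = unicodedata.normalize("NFC", a.decode("utf-8"))
    right = unicodedata.normalize("NFC", b.decode("utf-8"))
    return left == right


if __name__ == "__main__":
    print(same_octets(nfc_name, nfd_name))     # distinct as octet strings
    print(same_after_nfc(nfc_name, nfd_name))  # equal under the userland policy
```

Under the opaque-octet view the two names are simply different files; whether
to treat them as the same is left to whatever code on the caller side wants to
apply such a policy.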

> What I don't understand is [...] a proposal to put encodings
> considerations in the filesystems or in the file handling system
> calls.

I think I understand it.  I just think it's a wrong choice - or, since
none of the people involved in the discussion are stupid, it might be
more useful to say that I disagree with the relevant people (mostly
jkl, I think) about the relative values of the various things being
traded off.

> [...].  And I hate case-insensitive filesystems, so I don't want an
> enforcement of some cryptic policy deciding that two distinct strings
> passed are, indeed, the same thing: this is a user decision, not a
> system one.

You and I are in furious agreement here.  But people - as I said,
mostly jkl I think - have been arguing that this is indeed something
that belongs in the kernel.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
