Source-Changes-D archive


Re: CVS commit: src



    Date:        Tue, 3 Sep 2019 04:07:16 +0000
    From:        Taylor R Campbell <campbell+netbsd-source-changes-d%mumble.net@localhost>
    Message-ID:  <20190903040716.A6ABE6035E%jupiter.mumble.net@localhost>

  | How do we clean it up?

  | I am not seeing a good way out of this.

I do, but you are all refusing to permit it ... simply abandon support
for case insensitive filesystems.

In practice (and I admit to not really having studied any of them, as
they're all a stupid idea) I doubt that any of them really are truly
case insensitive ... rather, they are insensitive to the case of ASCII
chars, and that's usually it.

Are there any of these filesystems that treat capital delta the same
as lower case delta?   Or even capital U with an umlaut the same as
lower case u with an umlaut?   (or E-acute and e-acute)   etc?

And if they do, how do they manage to do that?   The only way I can
imagine is to permit only ISO 10646 (however encoded) characters in file
names, but if the file system is like that, how do we support it when
we have files on FFS (which has no such restriction) with characters
with the top bit set?   Is such a character 8859-1, 8859-7, 8859-11?

Those 3 have different methods of mapping their upper case chars to
their lower case equivalents, as I understand it (for 8859-11 it is easy:
there is no case, so none of the chars in the 0x80-0xff range are
equivalent to each other regardless of whether the filesystem is case
insensitive or not, but that would not, and should not, be true for the
other two mentioned).
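
To make that concrete, here is a rough sketch (Python, userland only,
nothing to do with any kernel code) of what happens to the same byte
under each of the three charsets. The helper name fold_byte is made up
for this illustration, and the case rules are whatever Python's own
codecs and str.lower() provide, which I am assuming is a fair stand-in
for each charset's mapping.

```python
def fold_byte(byte: int, charset: str) -> int:
    """Lower-case one byte under the given charset.

    Hypothetical helper for illustration only. Returns the byte
    unchanged if the charset gives it no single-byte lower-case form.
    """
    ch = bytes([byte]).decode(charset)
    try:
        lowered = ch.lower().encode(charset)
    except UnicodeEncodeError:
        # The lower-case form does not exist in this charset.
        return byte
    return lowered[0] if len(lowered) == 1 else byte


# Same two bytes, three interpretations.
for charset in ("iso8859_1", "iso8859_7", "iso8859_11"):
    print(charset, hex(fold_byte(0xB6, charset)), hex(fold_byte(0xC4, charset)))
```

Byte 0xB6 is a pilcrow in 8859-1 (no case), capital alpha with tonos
in 8859-7 (lower case is 0xDC), and a Thai letter in 8859-11 (no case).
Byte 0xC4 folds to 0xE4 in both 8859-1 and 8859-7 but stays 0xC4 in
8859-11. Without knowing the charset, there is no single right answer.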

Since none of the system calls that end up invoking namei() pass locale
info to the kernel, and even if they did, I doubt we'd want to build
locale-specific case mapping tables into the kernel (anyone's kernel),
and since I see no justification at all for doing special case handling
of English but not of all other languages, I'd suggest that any Unix
filesystem that doesn't treat all filenames as simply bit strings, with
special handling only for the bytes 0x2F and 0x00, is simply broken, and
we should not attempt to support that at all.
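
As a minimal sketch of what "simply bit strings" means (again Python,
purely illustrative; the function names are invented here and this is
not how namei() or any real filesystem is written), the whole of the
name handling reduces to something like:

```python
def is_valid_component(name: bytes) -> bool:
    """A path component is any non-empty byte string that contains
    neither 0x2F ('/') nor 0x00 (NUL). No other byte is special."""
    return len(name) > 0 and b"/" not in name and b"\x00" not in name


def same_name(a: bytes, b: bytes) -> bool:
    """Two names refer to the same entry only if they are byte-for-byte
    identical. No charset, locale, or case folding is involved."""
    return a == b


print(same_name(b"README", b"readme"))       # distinct names
print(is_valid_component(bytes([0xB6, 0xC4])))  # top-bit bytes are fine
```

Under this rule "README" and "readme" are simply two different names,
and bytes with the top bit set need no interpretation at all.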

Dealing with this kind of thing belongs in the applications, where locale
info is available (or can be), and where what is appropriate for treating
two different strings as if they represent the same thing can be dealt
with in a rational way - at least rational for that application.

So, to me, the solution is simple.   Leave it all as it is now, and
simply tell people using "case insensitive" file systems to either stop
doing that, or accept that they simply will not be able to build
everything.

kre


