Re: misc/51470 UTF-8 not support Russian
The following reply was made to PR misc/51470; it has been noted by GNATS.
From: Michael van Elst <mlelstv%serpens.de@localhost>
Subject: Re: misc/51470 UTF-8 not support Russian
Date: Fri, 11 Nov 2016 20:27:44 +0100
There are deficiencies in UTF-8 support, but Russian is no special case.
The system itself is pretty agnostic and treats filenames just as
a byte sequence, with special meaning given only to the values 47
(slash) and zero. Interpreting that byte sequence as some codepage
or as UTF-8 is a matter of convention.
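To illustrate the point, here is a minimal sketch in Python (not kernel code). The helper name is_valid_component is made up for this example, and rejecting the empty name is an added assumption. It shows that the only bytes excluded from a filename component are 47 and zero, and that the same Russian word gives different, equally acceptable byte sequences under UTF-8 and KOI8-R.

```python
def is_valid_component(name: bytes) -> bool:
    """Hypothetical helper: accept any non-empty byte string that
    contains neither slash (47) nor NUL (0)."""
    return len(name) > 0 and 47 not in name and 0 not in name


word = "привет"
as_utf8 = word.encode("utf-8")    # two bytes per Cyrillic letter
as_koi8 = word.encode("koi8_r")   # one byte per Cyrillic letter

# Both encodings are fine as far as the byte-level rule is concerned.
print(is_valid_component(as_utf8), is_valid_component(as_koi8))

# Reading UTF-8 bytes under the KOI8-R convention gives different text,
# which is why the interpretation is only a convention.
print(as_utf8.decode("koi8_r", errors="replace") == word)
```

Which of the two interpretations a program applies depends on its locale settings, not on anything stored with the filename.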
The Bourne shell (/bin/sh) didn't handle input bytes with bit 7 set
because that bit was used internally by the parser. The C shell can
handle 8-bit filenames but wouldn't understand a UTF-8 environment
in NetBSD-6; NetBSD-7 is fine. Other shells, including the current
bash (4.3.0) from pkgsrc, don't have that problem, and the native
/bin/sh has been fixed in NetBSD/-current.
VFAT stores long filenames in 16-bit Unicode. NetBSD used to ignore
that and use only the lower byte of each character. This allowed
arbitrary byte sequences in filenames but is incompatible with Windows.
NetBSD/-current can translate between the 16-bit Unicode data on
disk and UTF-8.
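The difference between the two behaviours can be sketched roughly as follows. This is a simplified illustration in Python, not the NetBSD msdosfs code: it assumes the long name is available as one contiguous little-endian 16-bit string (real VFAT splits it across directory entries), and the function names old_style_read and translated_read are made up for the example.

```python
import struct


def old_style_read(on_disk: bytes) -> bytes:
    """Keep only the low byte of each 16-bit unit (the older behaviour
    described above)."""
    units = struct.unpack("<%dH" % (len(on_disk) // 2), on_disk)
    return bytes(u & 0xFF for u in units)


def translated_read(on_disk: bytes) -> bytes:
    """Translate the 16-bit Unicode name to UTF-8."""
    return on_disk.decode("utf-16-le").encode("utf-8")


# A name as Windows would store it (assumed sample data).
on_disk = "Привет".encode("utf-16-le")

print(old_style_read(on_disk))                    # Cyrillic is lost
print(translated_read(on_disk).decode("utf-8"))   # name survives

# Plain ASCII names look the same either way.
ascii_name = "README".encode("utf-16-le")
print(old_style_read(ascii_name) == translated_read(ascii_name))
```

Cyrillic code points all share the high byte 0x04, so dropping it leaves unrelated low bytes, which is why names written by Windows came out as junk under the old scheme.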
The vi editor gained wide character support in NetBSD/-current
and you can now edit UTF-8 text files with it.
NetBSD locale support is limited, but LC_CTYPE shouldn't differ
between the various languages when encoding is UTF-8. Using ru_RU.UTF-8
Michael van Elst
"A potential Snark may lurk in every tree."