NetBSD-Bugs archive


Re: misc/51470 UTF-8 not support Russian



The following reply was made to PR misc/51470; it has been noted by GNATS.

From: Michael van Elst <mlelstv%serpens.de@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: misc/51470 UTF-8 not support Russian
Date: Fri, 11 Nov 2016 20:27:44 +0100

 UTF-8 support in NetBSD does have deficiencies, but Russian is not a
 special case.
 
 The system itself is encoding-agnostic: it treats a filename as a plain
 byte sequence in which only the values 47 (slash) and zero have special
 meaning. Interpreting that byte sequence as some codepage or as UTF-8
 is a matter of convention.
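
A minimal sketch of that point in Python, not taken from any NetBSD code. The helper `split_path` and the file name are hypothetical; the only bytes given meaning are 47 and 0, and the same stored bytes read differently depending on which convention is applied afterwards.

```python
def split_path(path: bytes) -> list:
    """Split a path on byte 47 ('/') only; all other bytes are opaque."""
    return [part for part in path.split(b"/") if part]

name = "привет.txt"          # hypothetical file name
raw = name.encode("utf-8")   # the bytes that would actually be stored

# Multi-byte UTF-8 sequences never contain byte 47 or byte 0,
# so such a name cannot collide with the two special values.
assert 0x2F not in raw and 0x00 not in raw

stored = split_path(b"/home/user/" + raw)[-1]

# Same bytes, two conventions: only one of them gives back the name.
as_utf8 = stored.decode("utf-8")
as_koi8r = stored.decode("koi8_r")
print(as_utf8 == name, as_koi8r == name)  # prints: True False
```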
 
 The Bourne shell (/bin/sh) didn't handle input bytes with bit 7 set
 because that bit was used internally by the parser. The C shell can
 handle 8-bit filenames, but in NetBSD-6 it wouldn't understand a UTF-8
 environment; NetBSD-7 is fine. Other shells, including the current
 bash (4.3.0) from pkgsrc, don't have that problem, and the native
 /bin/sh has been fixed in NetBSD/-current.
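
To see why a parser that reserves bit 7 is a problem for Russian text, the following Python sketch (an illustration only, not the /bin/sh parser) counts the bytes with bit 7 set. The helper `high_bit_bytes` and the sample word are hypothetical.

```python
def high_bit_bytes(data: bytes) -> list:
    """Return the bytes of data that have bit 7 (0x80) set."""
    return [b for b in data if b & 0x80]

ascii_word = "file".encode("utf-8")
russian_word = "файл".encode("utf-8")  # hypothetical sample input

# Plain ASCII never sets bit 7; every byte of UTF-8 encoded Cyrillic does.
print(len(high_bit_bytes(ascii_word)), "of", len(ascii_word))      # prints: 0 of 4
print(len(high_bit_bytes(russian_word)), "of", len(russian_word))  # prints: 8 of 8
```

Any program that uses bit 7 for its own bookkeeping would therefore trip over every byte of such input, in any non-ASCII language.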
 
 VFAT stores long filenames in 16-bit Unicode. NetBSD used to ignore
 that and use only the lower byte of each character. This allowed
 arbitrary byte sequences in filenames but is incompatible with Windows.
 NetBSD/-current can translate between the 16-bit Unicode data on
 disk and UTF-8.
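
The difference between the two behaviours can be modelled in a few lines of Python. This is a rough sketch, not the NetBSD msdosfs code: it assumes the on-disk name is a run of little-endian 16-bit units, and the function names `low_byte_only` and `to_utf8` are hypothetical.

```python
def low_byte_only(name_utf16le: bytes) -> bytes:
    """Model of the old behaviour: keep only the lower byte of each 16-bit unit."""
    return name_utf16le[0::2]

def to_utf8(name_utf16le: bytes) -> bytes:
    """Model of the newer behaviour: translate the 16-bit data to UTF-8."""
    return name_utf16le.decode("utf-16-le").encode("utf-8")

ascii_name = "README".encode("utf-16-le")
russian_name = "Привет".encode("utf-16-le")  # as a Windows system might store it

# ASCII survives either way, because its upper byte is always zero.
print(low_byte_only(ascii_name))                          # prints: b'README'

# Cyrillic loses its upper byte (0x04) and turns into unrelated bytes.
print(low_byte_only(russian_name))                        # prints: b'\x1f@825B'

# Translating to UTF-8 keeps the name intact.
print(to_utf8(russian_name).decode("utf-8") == "Привет")  # prints: True
```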
 
 The vi editor gained wide character support in NetBSD/-current,
 and you can now edit UTF-8 text files with it.
 
 NetBSD locale support is limited, but LC_CTYPE shouldn't differ
 between the various languages when the encoding is UTF-8, so using
 ru_RU.UTF-8 is fine.
 
 
 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv%serpens.de@localhost
                                 "A potential Snark may lurk in every tree."
 

