NetBSD-Bugs archive
Re: lib/57064: Import OpenBSD's script to autogen Unicode ctype definition?
The following reply was made to PR lib/57064; it has been noted by GNATS.
From: Joerg Sonnenberger <joerg%bec.de@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: lib-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
netbsd-bugs%netbsd.org@localhost
Subject: Re: lib/57064: Import OpenBSD's script to autogen Unicode ctype
definition?
Date: Tue, 18 Oct 2022 20:15:21 +0200
On Tue, Oct 18, 2022 at 04:30:01AM +0000, rokuyama.rk%gmail.com@localhost wrote:
> Unicode has added thousands of characters per year in a totally
> unorganized way. Our ctype definition for UTF-8 has been left
> untouched in the last decade, with very few exceptions:
I had a Python script for converting the CLDR definitions directly to
the full locale descriptions, but it seems I have misplaced it in
recent years.
> Also note that switching to OpenBSD's ctype definition of UTF-8 does
> *not* completely resolve our problems related to UTF-8. Our Citrus
> locale does not recognize combining characters (incl. variation
> selectors). Such characters may confuse applications.
That's not really surprising. Combining characters don't fit the classic
model of ISO C very well. They confuse a lot of software that makes poor
assumptions, such as every glyph corresponding 1:1 to a Unicode code
point. But that's beyond the scope here.
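To illustrate how the 1:1 assumption breaks down, here is a minimal
sketch in Python. It assumes only the standard unicodedata module; the
helper name count_base_characters is hypothetical, and skipping every
code point in a Mark category is a rough approximation, not full
grapheme cluster segmentation.

```python
import unicodedata


def count_base_characters(text):
    """Count code points that are not marks (general category M*).

    Hypothetical helper: combining accents and variation selectors are
    all in a Mark category (Mn, Mc or Me), so they are skipped here.
    """
    count = 0
    for ch in text:
        if unicodedata.category(ch).startswith("M"):
            continue  # combining mark or variation selector
        count += 1
    return count


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one glyph, two code points
decomposed = "e\u0301"
print(len(decomposed), count_base_characters(decomposed))

# U+2764 followed by U+FE0F VARIATION SELECTOR-16: again one glyph
heart = "\u2764\ufe0f"
print(len(heart), count_base_characters(heart))
```

Software that treats len() of the string as the number of glyphs
overcounts in both cases.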
Joerg