NetBSD-Bugs archive


Re: PR/57798 CVS commit: src/usr.bin/mklocale



The following reply was made to PR lib/57798; it has been noted by GNATS.

From: Valery Ushakov <uwe%stderr.spb.ru@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: PR/57798 CVS commit: src/usr.bin/mklocale
Date: Fri, 29 Dec 2023 00:16:12 +0300

 On Thu, Dec 28, 2023 at 03:50:01 +0000, Rin Okuyama wrote:
 
 >  It was implemented with an assumption that all digit characters
 >  can be mapped to numerical values <= 255.
 
 Unicode has three different "numeric" values for a character:
 
 Unicode Character Database
 https://unicode.org/reports/tr44/
 
   Numeric_Value is extracted based on the actual numeric value of the
   data in field 8 of UnicodeData.txt or the values of the
   kPrimaryNumeric, kAccountingNumeric, or kOtherNumeric tags, for
   characters listed in the Unihan data files.
 
   Numeric_Type is extracted as follows.  If fields 6, 7, and 8 in
   UnicodeData.txt are all non-empty, then Numeric_Type=Decimal.
   Otherwise, if fields 7 and 8 are both non-empty, then
   Numeric_Type=Digit.  Otherwise, if field 8 is non-empty, then
   Numeric_Type=Numeric.  For characters listed in the Unihan data
   files, Numeric_Type=Numeric for characters that have
   kPrimaryNumeric, kAccountingNumeric, or kOtherNumeric tags.  The
   default value is Numeric_Type=None.
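
 The Numeric_Type rule quoted above can be sketched in a few lines.
 This is only an illustration, not code from mklocale: the function
 name numeric_type and the sample records are assumptions, written in
 the semicolon-separated UnicodeData.txt layout with fields counted
 from 0, so that fields 6, 7, and 8 are the decimal, digit, and
 numeric columns.

```python
def numeric_type(record: str) -> str:
    """Classify one UnicodeData.txt record using the TR44 rule quoted above.

    Hypothetical helper for illustration; it ignores the Unihan data files.
    """
    fields = record.rstrip("\n").split(";")
    decimal, digit, numeric = fields[6], fields[7], fields[8]
    if decimal and digit and numeric:
        return "Decimal"
    if digit and numeric:
        return "Digit"
    if numeric:
        return "Numeric"
    return "None"  # the default Numeric_Type


# Sample records in UnicodeData.txt format (assumed here for illustration).
SAMPLES = [
    "0F20;TIBETAN DIGIT ZERO;Nd;0;L;;0;0;0;N;;;;;",
    "00B2;SUPERSCRIPT TWO;No;0;EN;<super> 0032;;2;2;N;SUPERSCRIPT DIGIT TWO;;;;",
    "00BD;VULGAR FRACTION ONE HALF;No;0;ON;<fraction> 0031 2044 0032;;;1/2;N;FRACTION ONE HALF;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
]

if __name__ == "__main__":
    for line in SAMPLES:
        code_point, name = line.split(";")[:2]
        print(f"U+{code_point} {name}: Numeric_Type={numeric_type(line)}")
```

 Note that field 8 may hold a fraction such as 1/2, so Numeric_Value
 is not always an integer, let alone one that fits in a byte.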
 
 The intention of TODIGIT is probably to eventually support something
 like LC_TIME's alt_digits, or the glibc printf(3) extension that
 provides the 'I' modifier for %d and friends. Both use
 locale-specific digits, say U+0F20..U+0F29 for Tibetan/Dzongkha
 locales.
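
 To make the idea of locale-specific digits concrete, here is a
 minimal sketch of mapping between integers and the Tibetan digit
 block. It is not NetBSD's TODIGIT and not glibc's 'I' implementation;
 the names to_digit and format_alt_digits are hypothetical, and the
 sketch assumes the ten digits occupy consecutive code points starting
 at the given zero.

```python
import unicodedata

TIBETAN_ZERO = 0x0F20  # U+0F20 TIBETAN DIGIT ZERO; U+0F29 is TIBETAN DIGIT NINE


def to_digit(ch: str, zero: int = TIBETAN_ZERO) -> int:
    """Return 0..9 if ch lies in the ten code points starting at zero, else -1."""
    value = ord(ch) - zero
    return value if 0 <= value <= 9 else -1


def format_alt_digits(n: int, zero: int = TIBETAN_ZERO) -> str:
    """Render an integer with ASCII digits replaced by the block starting at zero."""
    table = {ord("0") + i: zero + i for i in range(10)}
    return str(n).translate(table)  # a leading '-' is left untouched


if __name__ == "__main__":
    text = format_alt_digits(2023)
    print(text)
    print([to_digit(ch) for ch in text])
    # Cross-check the assumed layout against Python's Unicode database.
    print(all(unicodedata.decimal(chr(TIBETAN_ZERO + i)) == i for i in range(10)))
```

 For a block like this every digit value is in 0..9, so the values
 themselves are small even though the code points are well above 255.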
 
 But I don't really know much about those areas of locales...
 
 -uwe
 

