Subject: lib/33090: patch for new locales: ru_BY.CP1251, ru_RU.CP1251 and be_BY.CP1251
To: None <lib-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Aleksey Cheusov <cheusov@tut.by>
List: netbsd-bugs
Date: 03/16/2006 00:45:01
>Number:         33090
>Category:       lib
>Synopsis:       patch for new locales: ru_BY.CP1251, ru_RU.CP1251 and be_BY.CP1251
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 16 00:45:00 +0000 2006
>Originator:     Aleksey Cheusov <cheusov@tut.by>
>Release:        NetBSD 3.0_STABLE
>Organization:
>Environment:
System: NetBSD chen 3.0_STABLE NetBSD 3.0_STABLE (GENERIC) #2: Sun Mar 12 12:49:58 GMT 2006 cheusov@chen:/usr/src/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
>How-To-Repeat:
>Fix:

Hi.
Could somebody apply a small patch to libc that adds
a few new locales for the Russian and Belarusian
languages?
I hope my patch doesn't break anything;
I have never touched libc (or NetBSD) before.
I tested these locales using ERE character classes
and the tolower/toupper functions of awk.
Everything works correctly.
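
For illustration only (this is not part of the patch): a minimal
Python sketch of a similar check that does not need the new locales
installed. It expands a few MAPLOWER/MAPUPPER entries copied from the
patch and compares them with the case mapping of Python's own cp1251
codec. The helper names (expand, codec_case, mismatches) are
hypothetical, and treating "<first - last : dst>" as a range mapped
onto consecutive bytes starting at dst follows my reading of
mklocale(1).

```python
import re

# A few MAPLOWER/MAPUPPER entries copied from charset/CP1251 in the patch.
ENTRIES = """
MAPLOWER <0xa1 0xa2>
MAPLOWER <0xb2 0xb3>
MAPLOWER <0xc0 - 0xdf : 0xe0>
MAPUPPER <0xb4 0xa5>
MAPUPPER <0xe0 - 0xff : 0xc0>
"""

# "<src dst>" maps one byte; "<first - last : dst>" maps a range onto
# consecutive bytes starting at dst (assumed reading of mklocale(1)).
PATTERN = re.compile(
    r"(MAPLOWER|MAPUPPER)\s+<(0x[0-9a-f]+)"
    r"(?:\s*-\s*(0x[0-9a-f]+)\s*:)?\s*(0x[0-9a-f]+)>"
)


def expand(text):
    """Return {"MAPLOWER": {src: dst}, "MAPUPPER": {src: dst}}."""
    maps = {"MAPLOWER": {}, "MAPUPPER": {}}
    for kind, first, last, dst in PATTERN.findall(text):
        lo, base = int(first, 16), int(dst, 16)
        hi = int(last, 16) if last else lo
        for offset in range(hi - lo + 1):
            maps[kind][lo + offset] = base + offset
    return maps


def codec_case(byte, upper):
    """Case-convert one CP1251 byte via Python's codec (None if not 1:1)."""
    ch = bytes([byte]).decode("cp1251")
    out = ch.upper() if upper else ch.lower()
    try:
        enc = out.encode("cp1251")
    except UnicodeEncodeError:
        return None
    return enc[0] if len(enc) == 1 else None


def mismatches(maps):
    """List the entries where the table and the codec disagree."""
    bad = []
    for kind, table in maps.items():
        for src, dst in table.items():
            if codec_case(src, kind == "MAPUPPER") != dst:
                bad.append((kind, hex(src), hex(dst)))
    return bad


maps = expand(ENTRIES)
print(mismatches(maps))
```

An empty list means the sampled entries agree with the codec. Pasting
the full MAPLOWER/MAPUPPER section of charset/CP1251 into ENTRIES
should extend the check to the whole table, although entries written
with character literals such as 'A' would need a slightly wider
pattern.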

The patch is relative to $SRC/share/locale.

Actually, I just moved the character definitions from the
bg_BG.CP1251 file into a new charset/CP1251 file and "#include" it
from bg_BG.CP1251 and from the new locales' ctype/ files.

P.S.
What do lines like these mean?
    CHARSET         ",A"
    CHARSET         "(I"
    CHARSET         "$(@"

All files in $SRC/share/locale/ctype/charset
contain this magic; my patch does not add it to charset/CP1251.
The mklocale man page says
   "CHARSET    Controls character set for subsequent runes"
but the magic symbols above are still not clear to me :-(

diff -urN ctype.orig/Makefile ctype/Makefile
--- ctype.orig/Makefile	2006-03-14 00:14:05.000000000 +0000
+++ ctype/Makefile	2006-03-13 23:34:49.000000000 +0000
@@ -9,6 +9,9 @@
 FILESGRP=	${LOCALEGRP}
 FILESMODE=	${LOCALEMODE}
 
+LOCALES += be_BY.CP1251
+ LOCALESRC_be_BY.CP1251 = be_BY.CP1251
+
 LOCALES += bg_BG.CP1251
  LOCALESRC_bg_BG.CP1251 = bg_BG.CP1251
 
@@ -189,9 +192,15 @@
 LOCALES += pt_PT.ISO8859-15
  LOCALESRC_pt_PT.ISO8859-15 = en_US.DIS_8859-15
 
+LOCALES += ru_BY.CP1251
+ LOCALESRC_ru_BY.CP1251 = ru_BY.CP1251
+
 LOCALES += ru_RU.CP866
  LOCALESRC_ru_RU.CP866 = ru_RU.CP866
 
+LOCALES += ru_RU.CP1251
+ LOCALESRC_ru_RU.CP1251 = ru_RU.CP1251
+
 LOCALES += ru_RU.KOI8-R
  LOCALESRC_ru_RU.KOI8-R = ru_RU.KOI8-R
 
diff -urN ctype.orig/be_BY.CP1251.src ctype/be_BY.CP1251.src
--- ctype.orig/be_BY.CP1251.src	1970-01-01 00:00:00.000000000 +0000
+++ ctype/be_BY.CP1251.src	2006-03-14 00:14:54.000000000 +0000
@@ -0,0 +1,11 @@
+/*
+ * LOCALE_CTYPE for Belarusian Cyrillic character set (CP1251)
+ */
+
+ENCODING	"NONE"
+VARIABLE        Belarusian Cyrillic character set (CP1251) by <vle@gmx.net>, CODESET=CP1251
+
+/*
+ * This is a comment
+ */
+#include "charset/CP1251"
diff -urN ctype.orig/bg_BG.CP1251.src ctype/bg_BG.CP1251.src
--- ctype.orig/bg_BG.CP1251.src	2006-03-14 00:14:05.000000000 +0000
+++ ctype/bg_BG.CP1251.src	2006-03-13 23:52:47.000000000 +0000
@@ -11,81 +11,4 @@
 /*
  * This is a comment
  */
-ALPHA           'A' - 'Z' 'a' - 'z'
-ALPHA           0x80 0x81 0x83 0x8a 0x8c - 0x90 0x9a 0x9c - 0x9f
-ALPHA           0xa1 - 0xa3 0xa5 0xa8 0xaa 0xaf 0xb2 - 0xb4 0xb8 0xba
-ALPHA           0xbc - 0xff
-CONTROL		0x00 - 0x1f 0x7f 0x98
-DIGIT		'0' - '9'
-GRAPH           0x21 - 0x7e 0x80 - 0x97 0x99 - 0x9f 0xa1 - 0xff
-LOWER           'a' - 'z' 0x83 0x90 0x9a 0x9c - 0x9f 0xa2 0xb3 0xb4 0xb8
-LOWER           0xba 0xbc 0xbe 0xbf 0xe0 - 0xff
-PUNCT           0x21 - 0x2f 0x3a - 0x40 0x5b - 0x60 0x7b - 0x7e
-PUNCT           0x82 0x84 - 0x89 0x8b 0x91 - 0x97 0x99 0x9b 0xa4
-PUNCT           0xa6 0xa7 0xa9 0xab - 0xae 0xb0 0xb1 0xb5 - 0xb7 0xb9 0xbb
-SPACE		0x09 - 0x0d 0x20 0xa0
-UPPER           'A' - 'Z' 0x80 0x81 0x8a 0x8c - 0x8f 0xa1 0xa3 0xa5 0xa8
-UPPER           0xaa 0xaf 0xb2 0xbd 0xc0 - 0xdf
-XDIGIT          '0' - '9' 'a' - 'f' 'A' - 'F'
-BLANK		' ' '\t' 0xa0
-PRINT           0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
-SWIDTH1         0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
-
-MAPLOWER       	<'A' - 'Z' : 'a'>
-MAPLOWER       	<'a' - 'z' : 'a'>
-MAPLOWER        <0x80 0x90>
-MAPLOWER        <0x81 0x83>
-MAPLOWER        <0x83 0x83>
-MAPLOWER        <0x8a 0x9a>
-MAPLOWER        <0x8c - 0x8f : 0x9c>
-MAPLOWER        <0x90 0x90>
-MAPLOWER        <0x9a 0x9a>
-MAPLOWER        <0x9c - 0x9f : 0x9c>
-MAPLOWER        <0xa1 0xa2>
-MAPLOWER        <0xa2 0xa2>
-MAPLOWER        <0xa3 0xbc>
-MAPLOWER        <0xa5 0xb4>
-MAPLOWER        <0xa8 0xb8>
-MAPLOWER        <0xaa 0xba>
-MAPLOWER        <0xaf 0xbf>
-MAPLOWER        <0xb2 0xb3>
-MAPLOWER        <0xb3 - 0xb4 : 0xb3>
-MAPLOWER        <0xb8 0xb8>
-MAPLOWER        <0xba 0xba>
-MAPLOWER        <0xbc 0xbc>
-MAPLOWER        <0xbd 0xbe>
-MAPLOWER        <0xbe - 0xbf : 0xbe>
-MAPLOWER        <0xc0 - 0xdf : 0xe0>
-MAPLOWER        <0xe0 - 0xff : 0xe0>
-
-MAPUPPER       	<'A' - 'Z' : 'A'>
-MAPUPPER       	<'a' - 'z' : 'A'>
-MAPUPPER        <0x80 - 0x81 : 0x80>
-MAPUPPER        <0x83 0x81>
-MAPUPPER        <0x8a 0x8a>
-MAPUPPER        <0x8c - 0x8f : 0x8c>
-MAPUPPER        <0x90 0x80>
-MAPUPPER        <0x9a 0x8a>
-MAPUPPER        <0x9c - 0x9f : 0x8c>
-MAPUPPER        <0xa1 0xa1>
-MAPUPPER        <0xa2 0xa1>
-MAPUPPER        <0xa3 0xa3>
-MAPUPPER        <0xa5 0xa5>
-MAPUPPER        <0xa8 0xa8>
-MAPUPPER        <0xaa 0xaa>
-MAPUPPER        <0xaf 0xaf>
-MAPUPPER        <0xb2 0xb2>
-MAPUPPER        <0xb3 0xb2>
-MAPUPPER        <0xb4 0xa5>
-MAPUPPER        <0xb8 0xa8>
-MAPUPPER        <0xba 0xaa>
-MAPUPPER        <0xbc 0xa3>
-MAPUPPER        <0xbd 0xbd>
-MAPUPPER        <0xbe 0xbd>
-MAPUPPER        <0xbf 0xaf>
-MAPUPPER        <0xc0 - 0xdf : 0xc0>
-MAPUPPER        <0xe0 - 0xff : 0xc0>
-
-TODIGIT       	<'0' - '9' : 0>
-TODIGIT       	<'A' - 'F' : 10>
-TODIGIT       	<'a' - 'f' : 10>
+#include "charset/CP1251"
diff -urN ctype.orig/charset/CP1251 ctype/charset/CP1251
--- ctype.orig/charset/CP1251	1970-01-01 00:00:00.000000000 +0000
+++ ctype/charset/CP1251	2006-03-13 23:23:11.000000000 +0000
@@ -0,0 +1,81 @@
+/*
+ * CP-1251
+ */
+ALPHA           'A' - 'Z' 'a' - 'z'
+ALPHA           0x80 0x81 0x83 0x8a 0x8c - 0x90 0x9a 0x9c - 0x9f
+ALPHA           0xa1 - 0xa3 0xa5 0xa8 0xaa 0xaf 0xb2 - 0xb4 0xb8 0xba
+ALPHA           0xbc - 0xff
+CONTROL		0x00 - 0x1f 0x7f 0x98
+DIGIT		'0' - '9'
+GRAPH           0x21 - 0x7e 0x80 - 0x97 0x99 - 0x9f 0xa1 - 0xff
+LOWER           'a' - 'z' 0x83 0x90 0x9a 0x9c - 0x9f 0xa2 0xb3 0xb4 0xb8
+LOWER           0xba 0xbc 0xbe 0xbf 0xe0 - 0xff
+PUNCT           0x21 - 0x2f 0x3a - 0x40 0x5b - 0x60 0x7b - 0x7e
+PUNCT           0x82 0x84 - 0x89 0x8b 0x91 - 0x97 0x99 0x9b 0xa4
+PUNCT           0xa6 0xa7 0xa9 0xab - 0xae 0xb0 0xb1 0xb5 - 0xb7 0xb9 0xbb
+SPACE		0x09 - 0x0d 0x20 0xa0
+UPPER           'A' - 'Z' 0x80 0x81 0x8a 0x8c - 0x8f 0xa1 0xa3 0xa5 0xa8
+UPPER           0xaa 0xaf 0xb2 0xbd 0xc0 - 0xdf
+XDIGIT          '0' - '9' 'a' - 'f' 'A' - 'F'
+BLANK		' ' '\t' 0xa0
+PRINT           0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
+SWIDTH1         0x20 - 0x7e 0x80 - 0x97 0x99 - 0xff
+
+MAPLOWER       	<'A' - 'Z' : 'a'>
+MAPLOWER       	<'a' - 'z' : 'a'>
+MAPLOWER        <0x80 0x90>
+MAPLOWER        <0x81 0x83>
+MAPLOWER        <0x83 0x83>
+MAPLOWER        <0x8a 0x9a>
+MAPLOWER        <0x8c - 0x8f : 0x9c>
+MAPLOWER        <0x90 0x90>
+MAPLOWER        <0x9a 0x9a>
+MAPLOWER        <0x9c - 0x9f : 0x9c>
+MAPLOWER        <0xa1 0xa2>
+MAPLOWER        <0xa2 0xa2>
+MAPLOWER        <0xa3 0xbc>
+MAPLOWER        <0xa5 0xb4>
+MAPLOWER        <0xa8 0xb8>
+MAPLOWER        <0xaa 0xba>
+MAPLOWER        <0xaf 0xbf>
+MAPLOWER        <0xb2 0xb3>
+MAPLOWER        <0xb3 - 0xb4 : 0xb3>
+MAPLOWER        <0xb8 0xb8>
+MAPLOWER        <0xba 0xba>
+MAPLOWER        <0xbc 0xbc>
+MAPLOWER        <0xbd 0xbe>
+MAPLOWER        <0xbe - 0xbf : 0xbe>
+MAPLOWER        <0xc0 - 0xdf : 0xe0>
+MAPLOWER        <0xe0 - 0xff : 0xe0>
+
+MAPUPPER       	<'A' - 'Z' : 'A'>
+MAPUPPER       	<'a' - 'z' : 'A'>
+MAPUPPER        <0x80 - 0x81 : 0x80>
+MAPUPPER        <0x83 0x81>
+MAPUPPER        <0x8a 0x8a>
+MAPUPPER        <0x8c - 0x8f : 0x8c>
+MAPUPPER        <0x90 0x80>
+MAPUPPER        <0x9a 0x8a>
+MAPUPPER        <0x9c - 0x9f : 0x8c>
+MAPUPPER        <0xa1 0xa1>
+MAPUPPER        <0xa2 0xa1>
+MAPUPPER        <0xa3 0xa3>
+MAPUPPER        <0xa5 0xa5>
+MAPUPPER        <0xa8 0xa8>
+MAPUPPER        <0xaa 0xaa>
+MAPUPPER        <0xaf 0xaf>
+MAPUPPER        <0xb2 0xb2>
+MAPUPPER        <0xb3 0xb2>
+MAPUPPER        <0xb4 0xa5>
+MAPUPPER        <0xb8 0xa8>
+MAPUPPER        <0xba 0xaa>
+MAPUPPER        <0xbc 0xa3>
+MAPUPPER        <0xbd 0xbd>
+MAPUPPER        <0xbe 0xbd>
+MAPUPPER        <0xbf 0xaf>
+MAPUPPER        <0xc0 - 0xdf : 0xc0>
+MAPUPPER        <0xe0 - 0xff : 0xc0>
+
+TODIGIT       	<'0' - '9' : 0>
+TODIGIT       	<'A' - 'F' : 10>
+TODIGIT       	<'a' - 'f' : 10>
diff -urN ctype.orig/ru_BY.CP1251.src ctype/ru_BY.CP1251.src
--- ctype.orig/ru_BY.CP1251.src	1970-01-01 00:00:00.000000000 +0000
+++ ctype/ru_BY.CP1251.src	2006-03-14 00:15:51.000000000 +0000
@@ -0,0 +1,11 @@
+/*
+ * LOCALE_CTYPE for Russian Cyrillic character set (CP1251)
+ */
+
+ENCODING	"NONE"
+VARIABLE        Russian Cyrillic character set (CP1251) by <vle@gmx.net>, CODESET=CP1251
+
+/*
+ * This is a comment
+ */
+#include "charset/CP1251"
diff -urN ctype.orig/ru_RU.CP1251.src ctype/ru_RU.CP1251.src
--- ctype.orig/ru_RU.CP1251.src	1970-01-01 00:00:00.000000000 +0000
+++ ctype/ru_RU.CP1251.src	2006-03-14 00:14:50.000000000 +0000
@@ -0,0 +1,10 @@
+/*
+ * LOCALE_CTYPE for Russian Cyrillic character set (CP1251), based on bg_BG.CP1251
+ */
+ENCODING	"NONE"
+VARIABLE        Russian Cyrillic character set (CP1251) by <vle@gmx.net>, CODESET=CP1251
+
+/*
+ * This is a comment
+ */
+#include "charset/CP1251"

-- 
Best regards, Aleksey Cheusov.

>Unformatted: