Subject: Re: Russian characters -> utf-8
To: Roland Illig <rillig@NetBSD.org>
From: Mike M. Volokhov <mishka@NetBSD.org>
List: netbsd-docs
Date: 10/13/2005 10:10:09
On Thu, 13 Oct 2005 06:09:30 +0200
Roland Illig <rillig@NetBSD.org> wrote:

> roland@baccf5ee:/tmp/roland/htdocs/ru/Releases/formal-1.6 > make
> [xsltproc] NetBSD-1.6.2.xml -> NetBSD-1.6.2.html
> [list2html] index.list -> index.html
> "\x{00d0}" does not map to utf8 at list2html.pl line 590.
> "\x{00b4}" does not map to utf8 at list2html.pl line 590.
> "\x{00d0}" does not map to utf8 at list2html.pl line 590.
> "\x{00b7}" does not map to utf8 at list2html.pl line 590.
> 
> What do these diagnostics want to tell me? Converting U+00d0 into an 
> UTF-8 format is trivial, so what's the problem here?

The source files for the Russian site are KOI8-R encoded, but the generated
HTML files are all UTF-8. For *.list files the conversion is done by the
htdocs/ru/list2html.pl script, which contains the following line (line 57):

use open IN=>':encoding(koi8-r)', OUT=>':encoding(utf-8)';
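
To show what that line asks Perl to do, here is a rough sketch of the same
KOI8-R to UTF-8 conversion, written in Python so that it is independent of
the Perl installation in question. It is an illustration only and not part
of list2html.pl; the function name and the sample bytes are made up for
the example.

```python
# Illustration only: not part of list2html.pl.
# koi8r_to_utf8 and the sample bytes below are hypothetical.

def koi8r_to_utf8(raw: bytes) -> bytes:
    """Decode KOI8-R bytes into text, then encode that text as UTF-8."""
    text = raw.decode("koi8_r")   # input layer: bytes -> characters
    return text.encode("utf-8")   # output layer: characters -> bytes


# The Russian word for "hello" (six Cyrillic letters) as KOI8-R bytes.
sample = bytes([0xF0, 0xD2, 0xC9, 0xD7, 0xC5, 0xD4])
converted = koi8r_to_utf8(sample)

# Print the result as hex so the output does not depend on the terminal.
print(converted.hex())
```

Each Cyrillic letter takes one byte in KOI8-R and two bytes in UTF-8, so
the six input bytes become twelve output bytes.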

So this seems to be a Perl-related problem, although I can't reproduce it
on my system, sorry:

mishka@nostromo:48> pkg_info -I perl
perl-5.8.6nb4       Practical Extraction and Report Language
mishka@nostromo:49> pwd
/usr/home/mishka/NetBSD/htdocs/ru/Releases/formal-1.6
mishka@nostromo:50> make clean
/bin/rm -f Errs errs mklog NetBSD-1.6.2.html    index.html NetBSD-1.6.html NetBSD-1.6.1.html  NetBSD-1.6.2.html
mishka@nostromo:51> make
[xsltproc] NetBSD-1.6.2.xml -> NetBSD-1.6.2.html
[list2html] index.list -> index.html
[list2html] NetBSD-1.6.list -> NetBSD-1.6.html
[list2html] NetBSD-1.6.1.list -> NetBSD-1.6.1.html
mishka@nostromo:52> 

--
Mishka.