Re: bin/57544: sed(1) and regex(3) problem with encoding

> This whole "i18n" and "l10n" is a nightmare---and this is a not
> english native speaker who writes it...

And as a native anglophone - who knows a smattering of assorted other
languages - I agree.

I recently had occasion to send mail to a domain whose mail was hosted
by Google.  I sent it as ISO 8859-14, because it involved a small
amount of text in one of the Gaelic dialects and I prefer to use
seanċló when I can.

The text included a ċ.  But apparently, despite my marking it as
8859-14, by the time it got displayed (in their webmail interface, I
think), it had been converted into U+0104, LATIN CAPITAL LETTER A WITH
OGONEK, rather than the correct mapping, U+010B, LATIN SMALL LETTER C
WITH DOT ABOVE.
So I sent a test mail, containing each of the accented vowels and each
of the dotted consonants (well, most of them; I forgot Ṫ and ṫ, but
that's minor).

That mail, for all that it was also marked as being 8859-14, got
displayed as if it were 8859-1.
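
For anyone who wants to check the mappings, here is a minimal sketch
using the codecs bundled with Python 3.  It encodes ċ as 8859-14 and
then decodes the resulting byte under a few other charsets.  The list
of candidates is purely an assumption chosen for illustration; nothing
here shows what the receiving side actually did.

```python
import unicodedata

# The character from the message, encoded with the charset it was labelled as.
original = "\u010b"                    # ċ, LATIN SMALL LETTER C WITH DOT ABOVE
raw = original.encode("iso8859_14")    # a single byte in ISO 8859-14

# Charsets a mail reader might wrongly apply.
# This list is an illustrative assumption, not an exhaustive one.
candidates = ["iso8859_14", "iso8859_1", "iso8859_2", "cp1250", "cp1252"]

decoded = {}
for name in candidates:
    ch = raw.decode(name)
    decoded[name] = ch
    # Show the byte, the resulting code point, and its Unicode name.
    print(f"{name:12} 0x{raw[0]:02X} -> U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Of these candidates, only windows-1250 (cp1250) turns that byte into
U+0104, which is at least consistent with the first symptom, while
8859-1 turns it into a yen sign.  That is a guess at the mechanism, not
a diagnosis.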

Not even Google, apparently, can get it even vaguely right.

