Re: IDN hostname resolution in NetBSD
On May 27, 2010, at 3:59 AM, Johnny Billquist wrote:
>> Following up on this late part of the question, it looks like UTF-8 is bad
>> for local hostnames. Many things get confused, not the least of which is
>> wsconsole. But sshd seems to generate keys without understanding it as
>> UTF-8, and postfix harfs, etc etc.
>> Setting it to a latin1 hostname, including an ö, seems to almost work, or
>> at least show the right thing on boot, but postfix still harfs. Looks like
>> postfix can't handle anything other than 7-bit ascii in its retrieved
>> And, when using latin1, zsh (4.3.10, with a UTF-8 $LANG) screws up pretty
>> badly too, as one might expect.
>> So, I think this may be "a bad idea". :-(
> I'm confused. If you use a Latin1 character, how do you expect something
> which is trying to Grok UTF-8 to be able to parse that? They are not
> compatible or interchangeable. Actually, Latin1 is a character set, while
> UTF-8 is a character encoding form for the Unicode character set.
Right. I'm sorry if it seemed that I didn't expect much of what I saw; I was
just giving a description of what I tried, and what the outcome was. I am not
at all surprised that the 8-bit ASCII (latin1) hostname was displayed wrong in
anything that expected data to be UTF-8. I did expect that, I was just noting
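
To make the mismatch concrete, here is a minimal Python sketch. It is not from the original thread: the hostname string is a made-up example, and the checks only illustrate the byte-level difference, not what wsconsole, zsh, or postfix actually do internally.

```python
# Hypothetical hostname containing an o-umlaut (U+00F6); an assumption for
# illustration, not the hostname used in the thread.
hostname = "b\u00f6rje"

latin1_bytes = hostname.encode("latin-1")  # o-umlaut is the single byte 0xF6
utf8_bytes = hostname.encode("utf-8")      # o-umlaut is the two bytes 0xC3 0xB6

print(latin1_bytes)
print(utf8_bytes)

# A consumer that expects UTF-8 cannot decode the Latin-1 bytes at all,
# because 0xF6 is not a valid UTF-8 start byte.
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# The reverse direction does not fail; it silently produces the wrong text.
mojibake = utf8_bytes.decode("latin-1")

# Neither form is 7-bit ASCII, so a strictly ASCII-only consumer rejects both.
latin1_is_ascii = all(b < 0x80 for b in latin1_bytes)
utf8_is_ascii = all(b < 0x80 for b in utf8_bytes)

print(decoded_ok, repr(mojibake), latin1_is_ascii, utf8_is_ascii)
```

The same character therefore has two different on-disk forms, and a program has to be told (or guess) which one it is reading.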
> Personally, I stay far away from UTF-8 whenever I can. It's not a good
> solution, the only problem being that it's now the standard, so no other
> better solution is going to come along. :-(
> (Actually, Unicode is part of the problem, but that is here to stay as well.)
I take the opposite view. Well, not opposite in UTF-8/Unicode being bad, I
don't actually have an opinion there, because I don't know of a better
solution. I'm just happy that finally, after decades of near impossibility in
inter-operability with respect to non-7-bit-ASCII characters, it's almost
possible to use accented/foreign characters and expect that most
people/programs will understand them. This is the benefit of UTF-8 in my
opinion. But as you note, that's only because it's the standard, not because
it's at all a good design.
> Latin1 works better for almost anything, as long as you don't need characters
> not represented there, at which point it becomes useless (obviously).
Which is the reason why latin1 can't solve the bigger problem. It does work
for me in the case of the character(s) I want to use in this case, and I
suppose I could try to configure this machine to operate in latin1, but as all
of my other machines and operating systems pretty universally operate as UTF-8
(the aforenoted "standard"), it would be confusing, at the least.
One of my favorite quotes from years past: "That's the wonderful thing about
UNIX standards: So many to choose from."