Subject: Re: finger
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Martin Husemann <martin@duskware.de>
List: tech-userlevel
Date: 09/08/2002 09:52:00
On Sun, Sep 08, 2002 at 06:01:49AM +0200, der Mouse wrote:

> It is not clear.  The RFC (1288 is the latest finger RFC I find) is
> confused.  It says that "Any data transferred MUST be in ASCII format,
> with no parity", but then it goes on to talk about "characters between
> ASCII 128 and ASCII 255", apparently unaware that ASCII codes outside
> the range 0-127 simply _do not exist_.

So most participants in this thread seem to have interpreted this
ambiguity as "the on-wire protocol uses ISO-8859-1".

Lucky me, this happens to be my client charset.

Kimmo: the server does not need to know the client charset, but it needs
to communicate the character encoding used to transfer the data.

Without this information, the client cannot decode the characters and decide
which of them it can print - unless you fall back to the very simplistic
assumption above: if it ain't ASCII it must be single byte encoded ISO-8859-1.

> I would say that any attempt to use octets outside 0-127 on the wire is
> ill-advised at best.

Well, to some degree it certainly is wrong, and the right way is to get a
protocol extension that communicates the encoding used by the server to transfer
the data to the client. But to some degree (and for some people) the "interpret
it as ISO-8859-1" hack works.

What's wrong is to use isprint() and friends on the characters while using
the user's locale settings (as the change itojun backed out did). You could
either hard-code the valid chars or force the locale to ISO-8859-1 before
calling isprint().
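
A rough sketch of the "hard-code the valid chars" variant, assuming the wire
data is treated as ISO-8859-1. The function names latin1_isprint() and
latin1_sanitize() are made up for illustration and are not from the finger
sources. Treating 0xA0 (no-break space) and 0xAD (soft hyphen) as printable,
and passing newline and tab through, are my assumptions as well.

```c
#include <stddef.h>

/*
 * Return nonzero if octet c is a printable ISO-8859-1 character.
 * The ranges are hard-coded, so the result does not depend on the
 * user's locale:
 *   0x20-0x7E  printable ASCII
 *   0xA0-0xFF  printable Latin-1 (0x80-0x9F are C1 control codes)
 */
int
latin1_isprint(unsigned char c)
{
	return (c >= 0x20 && c <= 0x7E) || c >= 0xA0;
}

/*
 * Replace every non-printable octet in buf with '?'.
 * Newline and tab are passed through unchanged (an assumption).
 */
void
latin1_sanitize(unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (buf[i] != '\n' && buf[i] != '\t' &&
		    !latin1_isprint(buf[i]))
			buf[i] = '?';
	}
}
```

A client would run latin1_sanitize() over each buffer read from the server
before writing it to the terminal. Escape (0x1B) and the C1 controls get
replaced, while accented Latin-1 letters pass through.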


Martin