Subject: Re: finger
To: None <tech-userlevel@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-userlevel
Date: 09/12/2002 12:11:23
>> Assuming that 8859-* printables are safe isn't right; the safe set
>> could be larger or smaller than that -
> I'd be interested to know about 'smaller' cases.

Converting to 7-bit by stripping the 0x80 bit turns 0xff into DEL
(0x7f), as I specifically mentioned in the message you're responding to.
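
The arithmetic behind that can be checked with a minimal sketch. The
helper names strip_high_bit and is_ascii_printable are made up for
illustration and are not taken from finger or fingerd:

```python
def strip_high_bit(octet):
    """Clear the 0x80 bit, as a 7-bit-only path would."""
    return octet & 0x7F


def is_ascii_printable(octet):
    """True for 32-126, i.e. space through tilde."""
    return 32 <= octet <= 126


# Which octets in 161-255 stop being printable once the high bit is gone?
unsafe = [o for o in range(161, 256)
          if not is_ascii_printable(strip_high_bit(o))]
print([hex(o) for o in unsafe])  # prints ['0xff']
```

Every other octet in 161-254 lands on 33-126; only 0xff lands on a
control code.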

> Surely there are 'larger' ones (like windows-1250, koi8, euc-*, etc),

Um, KOI-8 is smaller, not larger: its printables, based on what little
info I've found for it, are 0xa3, 0xb3, and 0xc0-0xff.

> [...] I'm not aware of any character set with control characters in
> the range 161-255.

That doesn't make them safe.  I'm not aware offhand of any standard
that specifies non-printing semantics for that range, but (apparently
unlike you) I am not confident I am familiar with all relevant
standards, and I'm quite certain there is plenty of code out there that
gives those codes semantics not specified by any standard.  (For
example, code - or hardware! - that strips 0x80 bits will convert them
into low-half codes, thus giving them semantics not specified for the
original codes in any standard.)

And then there are things (like Lisp Machines) that will give octets
128-255 completely idiosyncratic semantics.  I'm *sure* I don't know
all of those, and I'd be astonished if none of them gave any sort of
control or escape meaning to any codes in 161..255.

>> - 142 (Single Shift2) and 143 (Single Shift3) should be allowed to
>>   support EUC codesets such as eucCN, eucTW, eucKR and eucJP.
>> - values 128-160 should be allowed to support Shift-JIS.
> 128-160 are control characters in iso-8859-*, so they are not safe to
> pass without character set protocol extension.

Why should *your* character set be the one that gets to define "safe to
pass" here?

>> Default should be defensive, shouldn't it?
> Yes.  Default of passing 33-127, 161-255 (in both finger and fingerd)
> is as defensive, interop-friendly and convenient as we can get.

Well, no.  Those three goals are to a certain extent at odds with one
another.  As defensive as we can get is to pass 32-126 (note, 32 not
33, and 126 not 127) plus CRLF pairs.  As interop-friendly as we can
get is not clear, but at a minimum would be 128-255, as soda points
out.  As convenient, well, convenient for whom?  As an obvious example,
convenient for Shift-JIS users can be pretty seriously inconvenient for
people using hardware or software based on ANSI X3.64.
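
One concrete instance of that clash, as a hedged sketch: in the 8-bit
form of ECMA-48 / ANSI X3.64, octet 0x9b is CSI (Control Sequence
Introducer), while Shift-JIS uses octets in 0x81-0x9f as the first byte
of two-byte characters. The code below assumes Python's built-in
shift_jis codec is a fair stand-in for what a Shift-JIS user would
send; the helper name sjis_chars_with_lead is made up for illustration.

```python
CSI = 0x9B  # 8-bit Control Sequence Introducer in ECMA-48 / ANSI X3.64


def sjis_chars_with_lead(lead):
    """Characters the shift_jis codec decodes from lead byte + some trail byte."""
    found = []
    for trail in range(0x40, 0xFD):
        try:
            text = bytes([lead, trail]).decode("shift_jis")
        except UnicodeDecodeError:
            continue  # not a valid two-byte sequence
        if len(text) == 1:
            found.append(text)
    return found


chars = sjis_chars_with_lead(CSI)
print(len(chars), "Shift-JIS characters begin with octet 0x9b")
```

Any of those characters, passed through unmodified, puts a CSI octet in
front of a terminal that honours 8-bit controls.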

I still stand by my earlier comment that any attempt to use 128-255 on
the wire is ill-advised at best.  I'll even go a little further: any
attempt to use anything but 32-126 plus CRLFs on the wire is
ill-advised at best.  (That's not to say I'd object to a knob admins
could use to make fingerd and maybe finger pass other codes; admins who
insist on shooting themselves in the foot might as well be allowed to.)
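
A minimal sketch of such a filter, for illustration only. The function
name sanitize is hypothetical, and replacing rejected octets with "?"
is an assumption; the message above says what to pass, not what to do
with everything else. Under this reading a bare CR or a bare LF is not
part of a CRLF pair and is therefore rejected too.

```python
def sanitize(data, replacement=b"?"):
    """Pass octets 32-126 and CRLF pairs; replace everything else."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 2] == b"\r\n":
            out += b"\r\n"          # CRLF pair passes as a unit
            i += 2
        elif 32 <= data[i] <= 126:
            out.append(data[i])     # space through tilde
            i += 1
        else:
            out += replacement      # controls, DEL, 128-255, bare CR or LF
            i += 1
    return bytes(out)


print(sanitize(b"user\x1b[2J\xff\r\n"))  # prints b'user?[2J?\r\n'
```

The knob for admins would amount to widening the range test in the
second branch.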

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B