tech-userlevel: Re: finger

Subject: Re: finger
To: T.SHIOZAKI <tshiozak@astec.co.jp>
From: Kimmo Suominen <kim@tac.nyc.ny.us>
List: tech-userlevel
Date: 09/11/2002 09:15:52
| From:    "T.SHIOZAKI" <tshiozak@astec.co.jp>
| Date:    Tue, 10 Sep 2002 23:05:40 +0900
|
| > Removing all protection from non-printable characters seems also
| > quite against the recommendations in RFC-1288.  I was not trying
| > to add pass-through of control characters.
|
| OK, I have fixed it to conform to rfc1288 strictly in this point.
| Thanks for your comment.

I think you are trying to implement the second switch from the RFC
(print all characters), while I'm trying to implement the first one
(print high characters) together with the environment option.

| BTW, your way doesn't ensure to conform to this recommendation strictly,
| because of the principle of ISO C locale system.

What principle?  I do not understand why using isprint(3) is not
acceptable.  Please elaborate.

| > Therefore, I oppose to your proposed change.
|
| Hmm, I (itojun and others) pointed out the essential problems of
| your change.  On the other hand, any indications you mentioned seem
| not essential.

I beg to disagree.  What exactly are the problems?  I was first told it
is a security problem, because terminals handling multi-byte character
sets could render 8bit characters as garbage, or even in a way that
causes security problems.

After asking several times about the security problems, Itojun pointed
me at this URL:

      http://www.google.co.jp/search?q=unicode+exploit&ie=UTF-8&oe=UTF-8&hl=ja&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=

It mostly reports a single Unicode conversion exploit in Microsoft
IIS.  I went through several pages, and found a couple of other items,
but they also discussed only problems with conversion from single-byte
character sets to Unicode.

So I did my best to fix the alleged security problem that was never
demonstrated in any way.

Now apparently isprint(3) and friends should not even be used, but it
has never been explained why.

What exactly is the agenda here?

| If you apply rfc1288 to our finger, you should not use setlocale()
| and any ctype.h functions (except for a few functions).
|
| While using ISO C locale system, we cannot touch any character code
| directly, except for a few cases.  However, rfc1288 is hard-wired to
| ASCII and mentions some concrete octet limits.  Thus, ISO C locale
| system conflicts with rfc1288.

Finger was already using isprint(3).  Please explain why this is not
a good approach.  I don't see why it would be preferable to define
a separate set of similar functions inside each program.
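
For illustration only, here is a minimal sketch of what such a
program-private, hard-wired ASCII test might look like.  The names
ascii_printable() and count_filtered() are hypothetical and are not
taken from finger or from either patch.  The octet range 32..126 plus
tab, CR and LF is my reading of RFC 1288 section 3.3.

```c
#include <stddef.h>

/*
 * Hypothetical locale-independent test: true only for the 7-bit
 * printable octets 32..126 (assumed from RFC 1288 section 3.3).
 */
static int
ascii_printable(unsigned char ch)
{
	return ch >= 0x20 && ch <= 0x7e;
}

/*
 * Count the octets in buf that such a filter would reject: anything
 * that is neither printable ASCII nor tab, CR or LF.
 */
static size_t
count_filtered(const unsigned char *buf, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		unsigned char ch = buf[i];

		if (!ascii_printable(ch) &&
		    ch != '\t' && ch != '\r' && ch != '\n')
			n++;
	}
	return n;
}
```

Every program taking this route carries its own copy of such a test,
and the test never consults the locale.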

I prefer a safe way of viewing 8bit data.  Your patch does not provide
it at all.  Mine does.  Mine also does away with the alleged security
problems with multi-byte character sets.

I also find it important to have an environment variable for enabling
safe display of 8bit data.  LC_CTYPE/LANG are the natural way of doing
that.
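
To make the idea concrete, here is a rough sketch of the approach, not
the actual patch.  The helper names sanitize() and init_ctype_from_env()
are made up for this example, and replacing rejected bytes with '?' is
only an assumption about the presentation.  The point is that isprint(3)
decides what is shown, and the user's LC_CTYPE/LANG decides what
isprint(3) accepts.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical start-up hook: adopt the character classification
 * from the environment (LC_ALL, LC_CTYPE, LANG).  If this is never
 * called, the program stays in the default "C" locale.
 */
static void
init_ctype_from_env(void)
{
	(void)setlocale(LC_CTYPE, "");
}

/*
 * Hypothetical filter: copy src into dst (at most dstsize - 1 bytes),
 * replacing every byte that the current LC_CTYPE locale does not
 * consider printable with '?'.  Tab and newline are passed through.
 */
static void
sanitize(char *dst, size_t dstsize, const char *src)
{
	size_t i = 0;

	if (dstsize == 0)
		return;
	for (; *src != '\0' && i + 1 < dstsize; src++) {
		/* The cast keeps the isprint() argument in range. */
		unsigned char ch = (unsigned char)*src;

		if (isprint(ch) || ch == '\t' || ch == '\n')
			dst[i++] = (char)ch;
		else
			dst[i++] = '?';
	}
	dst[i] = '\0';
}
```

With no locale set in the environment, control characters such as ESC
are replaced.  A user who sets LC_CTYPE to a single-byte locale opts in
to whatever 8bit characters that locale classifies as printable, and
the control characters are still replaced.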

+ Kim