Subject: Re: useradd: spaces and $ in usernames
To: None <tech-userlevel@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: tech-userlevel
Date: 11/16/2001 13:31:33
[ On Thursday, November 15, 2001 at 21:27:35 (-0500), der Mouse wrote: ]
> Subject: Re: useradd: spaces and $ in usernames
>
> > Usernames [] have had a specific character limitation for a Very Long
> > Time,
> 
> Where is this restriction documented, and what exactly _is_ the
> restriction?  I've been hanging around UNIX variants for long enough
> that I'd've expected to have picked up on it, but I don't recall
> anything of the sort.  I've seen "alphanumeric, ., _, -"; does that
> include non-ASCII letters, and if so, in which charset?  (And if not,
> do _you_ want to be the one who tells Søren "I'm sorry, Hels can have
> username hels, but you can't have username søren, because our OS is
> stuck in an English-centric view of the world? :-)

The original limitation on which characters were viable in usernames was
defined and enforced by getty, and by the properties of the tty and
serial port, modem, etc., that you might use to log in.

For example in AT&T UNIX System III the getty(8) manual page states:

       The user's login name is terminated by a new-line or  car-
       riage-return  character.  The latter results in the system
       being set to treat carriage returns appropriately.  If the
       login name contains only upper-case alphabetic characters,
       the system is told to map any future upper-case characters
       into the corresponding lower-case characters.
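
To make the consequence of that last sentence concrete, here is a rough
sketch in Python of the heuristic as I read the manual page.  The
function names are hypothetical, and the exact test (at least one
upper-case ASCII letter and no lower-case ones) is my assumption about
what "contains only upper-case alphabetic characters" means; it is not
taken from any getty source.

```python
def looks_like_uppercase_only_terminal(login_name):
    """Guess whether the name was typed on an upper-case-only terminal.

    Assumption: the name qualifies when it has at least one ASCII
    upper-case letter and no ASCII lower-case letters.
    """
    has_upper = any("A" <= c <= "Z" for c in login_name)
    has_lower = any("a" <= c <= "z" for c in login_name)
    return has_upper and not has_lower


def name_as_seen_by_login(login_name):
    """Return the name after the case mapping described above."""
    if looks_like_uppercase_only_terminal(login_name):
        # From here on the tty maps upper case to lower case, so an
        # account literally named in all upper case cannot be typed.
        return login_name.lower()
    return login_name
```

With this reading, typing "WOODS" reaches login as "woods", while
"Woods" is passed through unchanged.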

The following definition is from passwd(4) on AT&T UNIX System V Release 4:

       login_name     is the name specified by the user when log-
                      ging  in.  This field contains no uppercase
                      characters, should not be more  than  eight
                      characters  long,  and  should begin with a
                      non-numeric character (that is, any  alpha-
                      betic or special character except colon).
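
As an illustration only, the passwd(4) wording above could be turned
into a check along these lines.  The function name is hypothetical.
Rejecting the empty name, and rejecting a colon anywhere rather than
only in the first position, are my assumptions (the colon because it is
the field separator in the passwd file).  Note that the manual says
"should" for the length and first-character rules, so a real tool might
only warn about those.

```python
def svr4_login_name_problems(name):
    """Return a list of ways `name` departs from the SVR4 definition."""
    problems = []
    if not name:
        # Assumption: an empty login name is never acceptable.
        return ["empty"]
    if any(c.isupper() for c in name):
        problems.append("contains uppercase characters")
    if len(name) > 8:
        problems.append("longer than eight characters")
    if name[0].isdigit():
        problems.append("begins with a numeric character")
    if ":" in name:
        # Assumption: colon is rejected anywhere in the name.
        problems.append("contains a colon")
    return problems
```

An empty list means the name fits the definition; "woods" passes,
while "Woods" and "9lives" each produce one complaint.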

Further I'll also note that some MTAs, including at least Postfix,
Smail, and IIRC Exim too, rely on the above definition to properly
support case-insensitive matching of mailbox names to usernames.  I
believe Postfix, and perhaps Exim, give you enough rope to allow
uppercase characters in usernames, but they still do case-insensitive
matching so you've got to do duplicate username detection with the same
algorithm.  Smail, by default, requires all lower-case usernames as it
explicitly folds the mailbox name to lowercase when it hands it off to
mail.local.  This can be changed by the administrator, even while
keeping the "ignore_case" setting for matching usernames; or all folding
can be turned off to require exact matches.
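
The duplicate detection I mean could be sketched like this.  The
function name is hypothetical, and Python's str.lower() is only a
stand-in for whatever folding the MTA in question really does; the
point is that useradd and the MTA must agree on the algorithm.

```python
def find_case_insensitive_duplicates(usernames):
    """Group usernames that collide once case is folded.

    Returns a dict mapping each folded name to the list of original
    spellings, for folded names that occur more than once.
    """
    seen = {}
    for name in usernames:
        # Assumption: lower-casing matches the MTA's folding rule.
        seen.setdefault(name.lower(), []).append(name)
    return {folded: names for folded, names in seen.items()
            if len(names) > 1}
```

Given "woods", "Woods" and "mouse", this reports that "woods" and
"Woods" would both receive mail addressed to either spelling.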

Personally I don't see any problem with allowing ISO-8859-x high-bit
non-control characters (or any other encoding with similar properties),
just so long as you don't ever allow those users to use a 7-bit tty.

However I would very strongly recommend against allowing any shell
meta-characters in usernames.
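
A check for this might look like the following sketch.  Both names are
hypothetical, and the character set is my own illustrative list of
characters that are special to common Bourne-style shells (including
whitespace); it is not taken from useradd or from any standard
verbatim, so treat it as a starting point.

```python
# Assumption: an illustrative, not authoritative, set of characters
# that need quoting in common Bourne-style shells.
SHELL_METACHARACTERS = set("|&;<>()$`\\\"' \t\n*?[]#~=%!{}")


def shell_metacharacters_in(name):
    """Return the sorted list of shell meta-characters found in `name`."""
    return sorted(set(name) & SHELL_METACHARACTERS)
```

An empty result means none of the listed characters were found; for
the name "big $pender" it returns the space and the dollar sign.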

No matter what you do, though, I don't think you want to allow
all-uppercase usernames.

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>     <woods@robohack.ca>
Planix, Inc. <woods@planix.com>;   Secrets of the Weird <woods@weird.com>