IETF-SSH archive


Re: additional core draft nits in need of WG attention.



What Niels said.

Passwords should be UTF-8 encoded.  Clients should not be required to
apply a stringprep profile, or even a plain Unicode normalization form,
when the server processes the password.  But for md5-digest type
algorithms, where the client processes the password, a stringprep
profile MUST be specified.

In the case of "password" and "keyboard-interactive" userauth it's clear
that the server processes the password so no stringprep profile is
needed for the password in "password" userauth or for replies to
"keyboard-interactive" userauth prompts.
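
As a rough illustration of "the server processes the password", here is
a minimal Python sketch.  The helper name password_to_native and the
choice of NFC followed by latin-1 are assumptions made for the example,
not anything the drafts specify; a real server would use whatever its
password database expects.

```python
import unicodedata


def password_to_native(wire_bytes: bytes, native_encoding: str = "latin-1") -> bytes:
    """Hypothetical server-side helper: convert a password received as
    UTF-8 into the server's native form before hashing or comparison."""
    # Raises UnicodeDecodeError if the client sent invalid UTF-8.
    text = wire_bytes.decode("utf-8")
    # Assumed native form: Unicode normalization form C.
    text = unicodedata.normalize("NFC", text)
    # Raises UnicodeEncodeError if the password has no native representation.
    return text.encode(native_encoding)


# Two clients send the same password, one precomposed, one decomposed.
precomposed = "l\u00f6sen".encode("utf-8")
decomposed = "lo\u0308sen".encode("utf-8")
same_native_form = password_to_native(precomposed) == password_to_native(decomposed)
```

Because the conversion happens on the server, neither client needs to
know which form the server stores.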

Cheers,

Nico

On Mon, Nov 10, 2003 at 05:52:03PM +0100, Niels Möller wrote:
> At last, the final comment in this series...
> 
> Bill Sommerfeld <sommerfeld%east.sun.com@localhost> writes:
> 
> > > >4.  Section 5, last paragraph on page 9.  Saying that UTF-8 is the
> > > >encoding for passwords means that implementations need to check for valid
> > > >UTF-8 encoding.  This could lead to unexpected failures. It would be much
> > > >better to say that passwords are arbitrary binary strings with no
> > > >specified encoding.  Exact match of the binary strings ought to
> > > >be sufficient.
> >
> > Thoughts?  My understanding is that requiring exact match of
> > internationalized input is problematic under some circumstances.
> 
> This is confused. For both usernames and passwords, the server has to
> massage the input into its native format (unicode normalization form
> C, or latin-1, or whatever) before further processing. The further processing
> is then typically lookup in a database (for usernames) and "one-way
> encryption" for passwords. In both cases, the operation will typically
> fail if the input is not normalized.
> 
> Treating usernames and passwords differently makes no sense. Either
> we say that both usernames and passwords are ascii-only (8-bit
> characters could be allowed, with implementation defined meaning, so
> that we get a royal mess where users will never know whether or not
> 8-bit characters will work), or we say that both usernames and
> passwords are utf8.
> 
> I've argued earlier that the sender of utf8 strings should be required
> to normalize them (unicode normalization form C, or nameprep, not sure
> what's most appropriate). But I didn't get much support for that, and
> then the server MUST do the right thing when converting the strings to
> its native format. And if the server does the right thing for
> usernames, there's no extra cost in doing the same for passwords.
> 
> The goal of the utf8 use is to be able to support scenarios like this:
> 
>   * Unix server with usernames and passwords encoded in latin-1 in the
>     /etc/passwd file, and running in a latin-1 locale.
> 
>   * Unix client, also in a latin-1 locale.
> 
>   * Windows Pocket PC client, which is a native unicode application
>     and has never heard of latin-1.
> 
>   * The username "Åke Ärlig", which can be encoded in at least 6
>     different equally correct ways in unicode as well as in proper
>     utf8 without overlong sequences.
> 
> If this doesn't Just Work, then the protocol is broken. And it seems
> we are asked to break it.
> 
> /Niels
> 
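
The "at least 6" encodings of the username in the scenario above can be
checked with a short Python sketch.  The particular variants listed
(three spellings of the A-with-ring, two of the A-with-diaeresis) are
one reading of that count and are an assumption of this example, as is
the use of NFC plus latin-1 as the server's native form.

```python
import itertools
import unicodedata

# Three canonically equivalent spellings of the first letter:
# precomposed, "A" + COMBINING RING ABOVE, and ANGSTROM SIGN.
A_RING = ["\u00c5", "A\u030a", "\u212b"]
# Two canonically equivalent spellings of the second capital:
# precomposed, and "A" + COMBINING DIAERESIS.
A_DIAERESIS = ["\u00c4", "A\u0308"]

variants = [
    first + "ke " + second + "rlig"
    for first, second in itertools.product(A_RING, A_DIAERESIS)
]

# What different clients might put on the wire: all valid UTF-8,
# none using overlong sequences, all byte-wise different.
wire_forms = {v.encode("utf-8") for v in variants}

# What a latin-1 server would look up after normalizing to NFC.
native_forms = {
    unicodedata.normalize("NFC", v).encode("latin-1") for v in variants
}

print(len(wire_forms), "distinct wire encodings")
print(len(native_forms), "distinct native form(s) after normalization")
```

An exact binary match on the wire bytes would treat these as different
users; normalizing on the server before lookup maps them all to the
single latin-1 entry in /etc/passwd.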


