IETF-SSH archive
Re: additional core draft nits in need of WG attention.
What Niels said.
Passwords should be UTF-8 encoded. When the server processes the
password, clients should not be required to apply any stringprep
profile, or even a plain Unicode normalization form. But for md5-digest
type algorithms, where the client processes the password, a stringprep
profile MUST be specified.
In the case of "password" and "keyboard-interactive" userauth it's clear
that the server processes the password so no stringprep profile is
needed for the password in "password" userauth or for replies to
"keyboard-interactive" userauth prompts.
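
As a rough illustration of the server-side case, here is a minimal Python
sketch. It assumes the server normalizes to NFC before hashing and uses a
salted SHA-256 as a stand-in for whatever one-way function the real
password database uses. The function names are hypothetical and nothing
here comes from the drafts.

```python
import hashlib
import hmac
import unicodedata


def canonical_password(wire_bytes: bytes) -> bytes:
    """Decode the UTF-8 bytes the client sent and normalize them to NFC.

    The client sent whatever its input method produced; the server alone
    decides on the canonical form (NFC is an assumption of this sketch).
    """
    text = wire_bytes.decode("utf-8")  # strict: raises on invalid UTF-8
    return unicodedata.normalize("NFC", text).encode("utf-8")


def password_digest(wire_bytes: bytes, salt: bytes) -> bytes:
    # Salted SHA-256 is only a placeholder for the server's real
    # one-way password function.
    return hashlib.sha256(salt + canonical_password(wire_bytes)).digest()


def verify(wire_bytes: bytes, salt: bytes, stored_digest: bytes) -> bool:
    """Return True if the received password matches the stored digest."""
    try:
        candidate = password_digest(wire_bytes, salt)
    except UnicodeDecodeError:
        return False  # not valid UTF-8, so it cannot match
    return hmac.compare_digest(candidate, stored_digest)
```

With this shape, a client that sends "a" followed by a combining diaeresis
and a client that sends the precomposed character both authenticate
against the same stored digest, without either client normalizing.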
Cheers,
Nico
On Mon, Nov 10, 2003 at 05:52:03PM +0100, Niels Möller wrote:
> At last, the final comment in this series...
>
> Bill Sommerfeld <sommerfeld%east.sun.com@localhost> writes:
>
> > > >4. Section 5, last paragraph on page 9. Saying that UTF-8 is the
> > > >encoding for passwords means that implementations need to check for valid
> > > >UTF-8 encoding. This could lead to unexpected failures. It would be much
> > > >better to say that passwords are arbitrary binary strings with no
> > > >specified encoding. Exact match of the binary strings ought to
> > > >be sufficient.
> >
> > Thoughts? My understanding is that requesting exact match of
> > internationalized input is problematic under some circumstances.
>
> This is confused. For both usernames and passwords, the server has to
> massage the input into its native format (unicode normalization form
> C, or latin-1, or whatever) before further processing. The further processing
> is then typically lookup in a database (for usernames) and "one-way
> encryption" for passwords. In both cases, the operation will typically
> fail if the input is not normalized.
>
> Treating usernames and passwords differently makes no sense. Either,
> we say that both usernames and passwords are ascii-only (8-bit
> characters could be allowed, with implementation defined meaning, so
> that we get a royal mess where users will never know whether or not
> 8-bit characters will work), or we say that both usernames and
> passwords are utf8.
>
> I've argued earlier that the sender of utf8 strings should be required
> to normalize them (unicode normalization form C, or nameprep, not sure
> what's most appropriate). But I didn't get much support for that, and
> then the server MUST do the right thing when converting the strings to
> its native format. And if the server does the right thing for
> usernames, there's no extra cost in doing the same for passwords.
>
> The goal of the utf8 use is to be able to support scenarios like this:
>
> * Unix server with usernames and passwords encoded in latin-1 in the
> /etc/passwd file, and running in a latin-1 locale.
>
> * Unix client, also in a latin-1 locale.
>
> * Windows Pocket PC client, which is a native unicode application
> and has never heard of latin-1.
>
> * The username "Åke Ärlig", which can be encoded in at least 6
> different equally correct ways in unicode as well as in proper
> utf8 without overlong sequences.
>
> If this doesn't Just Work, then the protocol is broken. And it seems
> we are asked to break it.
>
> /Niels
>
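
The scenario in the quoted message can be sketched in Python. This is only
an illustration under stated assumptions: the server is assumed to decode
strictly, normalize to NFC, and then convert to latin-1, and the names
`spellings` and `to_native` are made up for the example. The six spellings
come from three canonically equivalent forms of the first letter (U+00C5,
"A" plus U+030A, and U+212B ANGSTROM SIGN) times two forms of "Ä" (U+00C4,
and "A" plus U+0308).

```python
import itertools
import unicodedata

# Canonically equivalent ways to write each accented capital letter.
A_RING = ["\u00c5", "A\u030a", "\u212b"]   # precomposed, combining, Angstrom sign
A_DIAERESIS = ["\u00c4", "A\u0308"]        # precomposed, combining


def spellings() -> list[str]:
    """All six code point sequences that display as the same username."""
    return [f"{a}ke {b}rlig" for a, b in itertools.product(A_RING, A_DIAERESIS)]


def to_native(wire_bytes: bytes, native: str = "latin-1") -> bytes:
    """Convert UTF-8 from the wire to the server's native encoding.

    Normalizing to NFC first composes the combining sequences, so each
    spelling maps to the same native byte string.
    """
    text = wire_bytes.decode("utf-8")  # strict: rejects overlong sequences
    return unicodedata.normalize("NFC", text).encode(native)


if __name__ == "__main__":
    wire_forms = {s.encode("utf-8") for s in spellings()}
    native_forms = {to_native(w) for w in wire_forms}
    print(len(wire_forms), "distinct UTF-8 byte strings on the wire")
    print(len(native_forms), "distinct latin-1 byte strings after conversion")
```

A byte-for-byte comparison of the wire forms would treat these as six
different usernames, which is why exact binary matching does not make the
scenario work.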