Subject: opencrypto(9) API botch: HMAC sizes, IPsec vs. TLS vs. known-answer
To: None <tech-kern@netbsd.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 04/28/2004 20:06:51
(Note: cc'ed to Sam Leffler as a courtesy for the FreeBSD users of this API.
Forwarding it to a sensible person involved in OpenBSD's
crypto(9) might be a good idea.)

Someone, somewhere, mis-specified the size of MD5 and SHA1 HMAC
outputs from opencrypto(9) as being 96 bits. These sizes are wrong
(though they coincidentally match the 96-bit HMACs used by IPsec).
The true HMAC-MD5 and HMAC-SHA1 results are 128 and 160 bits,
respectively. This makes a nasty mess of the API for any non-IPsec
users who need all the HMAC bits.
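
To make the sizes concrete, here is a small userland sketch (Python,
standard hashlib/hmac only, nothing to do with the kernel code). The
key and message are arbitrary placeholders, and hmac_sizes() is a
made-up helper name. It reports the full output size, the 96-bit
truncated size, and how many bits the truncation discards.

```python
import hashlib
import hmac

KEY = b"example key"          # arbitrary illustrative key (assumption)
MSG = b"example message"      # arbitrary illustrative data (assumption)
TRUNCATED_BYTES = 96 // 8     # the 96-bit size currently wired into the API


def hmac_sizes(hash_name):
    """Return (full_bits, truncated_bits, discarded_bits) for one hash."""
    full = hmac.new(KEY, MSG, hash_name).digest()
    truncated = full[:TRUNCATED_BYTES]      # keep the leftmost 96 bits
    discarded = (len(full) - len(truncated)) * 8
    return len(full) * 8, len(truncated) * 8, discarded


for name in ("md5", "sha1"):
    full_bits, trunc_bits, lost_bits = hmac_sizes(name)
    print(f"HMAC-{name.upper()}: {full_bits} bits full, "
          f"{trunc_bits} bits truncated, {lost_bits} bits discarded")
```

For SHA1 the discarded part is 64 of 160 bits.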

This is a most vexing problem. First, it's now impossible to use
opencrypto(9) to compute HMAC-SHA1 for TLS; opencrypto(9) throws away
two-fifths of the hash bits, and apparently TLS uses all 160 of them.

Second, we've polluted our API namespace in a way that will be
difficult to recover from.  RFC 2104 suggests a naming convention for
HMACs that indicates truncation by a suffix. If we were to follow that
convention, the existing [open]crypto(9) HMAC operations should be
called something like CRYPTO_HMAC_MD5_96 and CRYPTO_HMAC_SHA1_96, with
the existing API values instead returning full-size, non-truncated results.
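
As a rough illustration of the suffix convention (again a userland
Python sketch, not a proposal for the kernel interface): a name of the
form HMAC-<hash> means the full output, and HMAC-<hash>-<t> means the
leftmost t bits. The function name hmac_by_name() and the exact name
parsing are my own assumptions for the example.

```python
import hashlib
import hmac


def hmac_by_name(name, key, data):
    """Compute 'HMAC-<hash>' (full) or 'HMAC-<hash>-<t>' (leftmost t bits)."""
    parts = name.upper().split("-")
    if len(parts) not in (2, 3) or parts[0] != "HMAC":
        raise ValueError(f"unrecognised HMAC name: {name!r}")

    # e.g. "SHA1" -> "sha1", which hashlib accepts as an algorithm name
    full = hmac.new(key, data, parts[1].lower()).digest()
    if len(parts) == 2:
        return full                      # no suffix: all the bits

    bits = int(parts[2])
    if bits % 8 != 0 or not 0 < bits <= len(full) * 8:
        raise ValueError(f"unsupported truncation length: {bits}")
    return full[: bits // 8]             # suffix: leftmost t bits only
```

With this, hmac_by_name("HMAC-SHA1-96", k, d) is simply the first 12
bytes of hmac_by_name("HMAC-SHA1", k, d), which is the relationship
the two sets of names would need to express.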

But wait, it gets worse.  At least in principle, hardware may support
full-sized HMACs as standalone operations, but allow only the
truncated HMAC for fused operations (that is, encrypt-then-generate-HMAC,
in one pass over the data; or a regenerate-HMAC-over-ciphertext-then-decrypt,
again in one pass over the data).

Surely the natural solution is to have (open)crypto(9) and crypto(4)
return all the HMAC bits which are computed by the underlying hardware,
and let the client do any necessary truncation.  That means an API
change, and (if hardware supports a range of different sizes) it may
mean extending the implementation, to pass back an explicit size for
all HMAC results.

It's not clear to me whether or not drivers should attempt to hide this
from userspace (or from the crypto framework), by providing
backwards-compatible entrypoints. I also don't see what to do for
hardware which genuinely only supports the 96-bit truncated HMACs, or
what to do for hardware that returns full-size HMAC results for a
standalone operation, but only returns 96-bit HMAC results for fused
operations.

Further questions: which size should IPsec ask for? Should IPsec try
the smaller, then fall back to the larger and do its own truncation?
Or should all drivers that can do non-truncated HMACs also register a
truncated version, and do the truncation in software?
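
One way to picture the "try the smaller, then fall back" option is the
sketch below (Python, purely illustrative). It assumes a driver can
advertise the set of HMAC output sizes it supports for a given
operation; that capability set and the name choose_hmac_request() are
hypothetical, not anything in the current framework.

```python
def choose_hmac_request(supported_bits, wanted_bits):
    """Pick which output size to request from a driver.

    supported_bits: set of output sizes (in bits) the driver offers
                    for this operation (hypothetical capability info).
    wanted_bits:    size the client actually needs.

    Returns (request_bits, truncate_in_software).
    """
    if wanted_bits in supported_bits:
        # Exact match: let the hardware produce the size we want.
        return wanted_bits, False

    larger = [b for b in supported_bits if b > wanted_bits]
    if not larger:
        # Only shorter outputs exist; missing bits cannot be recovered.
        raise ValueError("driver cannot supply enough HMAC bits")

    # Fall back to the smallest larger size and truncate it ourselves.
    return min(larger), True


print(choose_hmac_request({96, 160}, 96))   # IPsec, both sizes offered
print(choose_hmac_request({160}, 96))       # IPsec, full-size only
```

A client wanting 160 bits from hardware that only offers 96 gets an
error here, which is exactly the case with no good answer.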

This matters for accelerating TLS hashes; and it matters for
performing known-answer tests (which is something the system really,
*really* should be doing at startup, as a sanity check).
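
For a sense of what such a known-answer test looks like, here is a
userland sketch in Python using test case 1 from RFC 2202 for HMAC-MD5
and HMAC-SHA1. The harness (run_kats() and the two compute functions)
is invented for the example; truncating_hmac() merely stands in for an
interface that hands back only 96 bits.

```python
import hashlib
import hmac

# RFC 2202, test case 1: (hash, key, data, expected full digest in hex)
KATS = [
    ("md5", bytes([0x0B]) * 16, b"Hi There",
     "9294727a3638bb1c13f48ef8158bfc9d"),
    ("sha1", bytes([0x0B]) * 20, b"Hi There",
     "b617318655057264e28bc0b6fb378c8ef146be00"),
]


def software_hmac(alg, key, data):
    """Reference implementation returning the full HMAC."""
    return hmac.new(key, data, alg).digest()


def truncating_hmac(alg, key, data):
    """Stand-in for an interface that returns only the leftmost 96 bits."""
    return software_hmac(alg, key, data)[:12]


def run_kats(compute):
    """Return the names of the algorithms whose known answer did not match."""
    failures = []
    for alg, key, data, expected_hex in KATS:
        expected = bytes.fromhex(expected_hex)
        if not hmac.compare_digest(compute(alg, key, data), expected):
            failures.append(alg)
    return failures


print("software:", run_kats(software_hmac))
print("truncating:", run_kats(truncating_hmac))
```

The truncating version cannot match the published vectors as written;
the best it could do is compare a 96-bit prefix, which checks less
than the full answer.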

So.... anyone got any bright ideas on how to address the problem?