tech-kern archive

Re: AES leaks, cgd ciphers, and vector units in the kernel



Taylor R Campbell <riastradh%NetBSD.org@localhost> writes:

>> What I meant is: consider an external USB disk of say 4T, which has a
>> cgd partition within which is ffs.
>> 
>> Someone attaches it to several systems in turn, doing cgd_attach, mount,
>> and then runs bup with /mnt/bup as the target, getting deduplication
>> across systems.
>
> (Side note: as a matter of architecture I would recommend
> incorporating the cryptography into the application, like borgbackup,
> restic, or Tarsnap do -- anything at a higher level than disks (even
> at the level of the file system, like zfs encryption) has much more
> flexibility and can also provide authentication.  Generally the main
> use case for disk encryption is to enable recycling disks without
> worrying about information disclosure; the threat model and security
> of disk encryption systems are both qualitatively very weak.)

Sure, but this is about something really reliable for getting data
back in disaster recovery: simple, and using only tools that have
existed for a long time.  (You can't run zfs on old systems, and
borgbackup has had enough stability issues that I wouldn't trust it.)

>> So, using the new faster cipher won't work, because it's not supported
>> by the older systems.
>> 
>> However, if the -current system does AES slowly because it has the new
>> constant-time implementation, and the older ones do it like they used
>> to, I don't see a real problem.
>
> OK.  If you encounter a scenario where this is likely to be a real
> problem, let me know.

From my viewpoint, a 3x slowdown with 100% reliability is not a big
deal.

> I drafted an SSE2 implementation which considerably improves on the
> BearSSL aes_ct implementation on a number of amd64 CPUs I tested from
> around a decade ago.  It is still slower than before -- and AES-CBC
> encryption hurts by far the most, because it is necessarily
> sequential, whereas AES-CBC decryption and AES-XTS in both directions
> can be vectorized -- but it does mitigate the problem somewhat.  This
> covers all amd64 CPUs and probably most `i386' CPUs of the last 15-20
> years.
>
> There is some more room for improvement -- SSSE3 provides PSHUFB which
> can sequentially speed up parts of AES, and is supported by a good
> number of amd64 CPUs starting around 14 years ago that lack AES-NI --
> but there are diminishing returns for increasing implementation and
> maintenance effort, so I'd like to focus on making an impact on
> systems that matter.  (That includes non-x86 CPUs -- e.g., we could
> probably easily adapt the Intel SSE2 logic to ARM NEON -- but I would
> like to focus on systems where there is demand.)

That sounds good.
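
For anyone following along, here is a minimal sketch of why AES-CBC
encryption is stuck being sequential while CBC decryption can be
vectorized.  The block function below is a made-up toy permutation, NOT
AES, and every name in it is hypothetical; only the chaining structure
matters.

```python
BLOCK = 16  # block size in bytes, same as AES


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Hypothetical stand-in for AES: XOR with the key, rotate left one byte.
    # It is invertible but has no security whatsoever.
    mixed = xor(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    # Inverse of toy_encrypt_block: rotate right one byte, XOR with the key.
    unrotated = block[-1:] + block[:-1]
    return xor(unrotated, key)


def cbc_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    # Each ciphertext block feeds into the next one, so block i cannot
    # be started until block i-1 is finished: inherently sequential.
    out = []
    prev = iv
    for p in blocks:
        prev = toy_encrypt_block(key, xor(p, prev))
        out.append(prev)
    return out


def cbc_decrypt_any_order(key: bytes, iv: bytes, cblocks: list, order) -> list:
    # Every input needed for block i (cblocks[i] and cblocks[i-1]) is known
    # up front, so blocks can be processed in any order, or several at once.
    prevs = [iv] + cblocks[:-1]
    out = [None] * len(cblocks)
    for i in order:
        out[i] = xor(toy_decrypt_block(key, cblocks[i]), prevs[i])
    return out
```

Decrypting in reverse order, e.g.
cbc_decrypt_any_order(key, iv, ct, reversed(range(len(ct)))), gives back
the original plaintext blocks, which is the independence a vector unit
can exploit.  There is no such freedom in cbc_encrypt.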

> I drafted a couple programs to approximately measure performance from
> userland.  They are very naive and do nothing to measure overhead from
> cgd(4) or disk i/o itself.
>
> https://www.NetBSD.org/~riastradh/tmp/20200621/aestest.tgz
> https://www.NetBSD.org/~riastradh/tmp/20200622/adiantum.tgz

Thanks - will try them.

>> So it remains to make userland AES constant-time as well, as a
>> separate step?
>
> Correct.

ok - and helpful details from nia@ noted.

