tech-kern archive


Re: Restricting rdtsc [was: kernel aslr]



Answering each of your questions in one mail, with a few notes at the end.
First of all, this is just a wild idea I had on the train the other day,
and I haven't written any code for it. Then:

On 28/03/2017 at 18:01, Mouse wrote:
> (1) Please provide a kernel build option to remove the restriction.
> [...]

My original plan was to use sysctls, as suggested by Manuel: one to
enable/disable the feature, another to log the segfaults.

On 28/03/2017 at 18:01, Mouse wrote:
> (2) Does that actually help, or does it just compel the attacker to use
> cruder timers and thus longer test runs?  (Or is that enough difference
> that you believe it would actually help in practice?)

It does help, and that's the conclusion of most papers. There is, however,
another technique (a software clock) that tries to estimate the number of
cycles an operation takes, but the resulting accuracy is very low, and not
sufficient to detect cache misses via latency.

On 28/03/2017 at 18:30, David Young wrote:
> Why do you single out the rdtsc instruction instead of other time
> sources?

Because of accuracy. As far as the papers point out, detecting cache misses
requires a resolution of roughly 50 cycles or better, and only rdtsc offers
this resolution. Syscalls and other software-based timers have a
non-deterministic overhead that is bigger than ~50 cycles, and that overhead
drowns out the relevant information.
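
To get a feel for that overhead, here is a rough sketch (not something from
the papers) that looks at the smallest gap between two back-to-back readings
of a software timer. It assumes a POSIX system and picks
clock_gettime(CLOCK_MONOTONIC) purely as an example of a software time
source; now_ns() and min_timer_step_ns() are made-up helper names.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds (assumed time source). */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Smallest gap, in nanoseconds, seen between two back-to-back readings
 * over 'rounds' attempts. Anything shorter than this gap cannot be
 * resolved with this timer.
 */
static uint64_t
min_timer_step_ns(int rounds)
{
	uint64_t best = UINT64_MAX;
	int i;

	for (i = 0; i < rounds; i++) {
		uint64_t a = now_ns();
		uint64_t b = now_ns();

		if (b - a < best)
			best = b - a;
	}
	return best;
}
```

Calling min_timer_step_ns(1000) gives a lower bound on what this timer can
resolve on a given machine; the value is in nanoseconds, so it still has to
be converted to cycles for the CPU at hand before comparing it with the ~50
cycle figure.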

On 28/03/2017 at 18:30, David Young wrote:
> What do you mean by "legitimately" use rdtsc?  It seems to me that it
> is legitimate for a user to use a high-resolution timer to profile some
> code that's under development.  They may want to avoid running that code
> with root privileges under most circumstances.

Just as you said: some users need to profile code they are developing.
They may indeed want to avoid running their tests with root privileges, and
that's where the sysctl is useful - they can disable the feature if they
want to.

A few notes now. In fact, the rdpmc instruction can also be used for
side-channel attacks, but we don't currently enable it, so it does not
matter.

Regarding serialization, I may not have been clear enough either. rdtsc is
not serializing, which means that it does not wait for the previous
instructions to execute completely before being executed. To compensate for
that, the user needs to first execute a serializing instruction like cpuid,
and put the rdtsc right after it. With the fault approach, serialization is
ensured, because when returning to userland 'iret' is used, which is
serializing. So we have an 'iret+rdtsc', which has the same effect as
'cpuid+rdtsc'.
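
For reference, the userland 'cpuid+rdtsc' sequence can be sketched as
follows. This is only an illustration, assuming an x86 CPU and GCC/Clang
inline assembly; serialized_rdtsc(), spin() and measure_cycles() are made-up
names, not existing interfaces.

```c
#include <stdint.h>

/*
 * Read the TSC right after a serializing cpuid, so that earlier
 * instructions have completed before the counter is sampled.
 * x86 only, GCC/Clang inline asm.
 */
static inline uint64_t
serialized_rdtsc(void)
{
	uint32_t a = 0, b, c = 0, d;

	__asm__ volatile(
	    "cpuid\n\t"		/* serializes; writes eax, ebx, ecx, edx */
	    "rdtsc"		/* edx:eax = time-stamp counter */
	    : "+a"(a), "=b"(b), "+c"(c), "=d"(d)
	    :
	    : "memory");
	(void)b;
	(void)c;
	return ((uint64_t)d << 32) | a;
}

/* Hypothetical workload, only here to have something to time. */
static void
spin(void)
{
	volatile unsigned int i;

	for (i = 0; i < 1000; i++)
		continue;
}

/* Cycles spent in work(), measured between two serialized readings. */
static uint64_t
measure_cycles(void (*work)(void))
{
	uint64_t start = serialized_rdtsc();

	work();
	return serialized_rdtsc() - start;
}
```

For example, measure_cycles(spin) returns the cycle count for the dummy
loop. Note that the result includes the cost of the second cpuid, so very
short sequences are still measured with some fixed overhead.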

Also, a detail about my remark on accuracy. The basic use case for rdtsc is
the following:
	start = rdtsc
	work
	end = rdtsc
	elapsed = end - start
Here, we will fault on the first rdtsc; so the kernel will be entered, and
many cycles will be consumed there. But it does not matter, since the first
rdtsc is used as the starting point, and we don't care about adding cycles
before it. Therefore, the number of elapsed cycles is the same, with and
without the feature.

Finally, I'll add that there are other mitigations available for rdtsc,
which consist, for example, of adding a small random delta directly to the
counter in order to fuzz the results. But then there is the problem of how
big this delta needs to be: big enough to mitigate side-channels, small
enough to still give relevant, if slightly inaccurate, information back to
userland.
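
To make the fuzzing idea concrete, here is a minimal sketch, under
assumptions of my own: the delta is uniform-ish in [0, max_delta], and the
random source is a toy xorshift64 generator standing in for whatever the
kernel would really use. fuzz_tsc() and xorshift64() are hypothetical names,
and choosing max_delta is exactly the open question above.

```c
#include <stdint.h>

/*
 * Toy xorshift64 generator (assumption: stands in for a real kernel
 * random source). The state must be non-zero.
 */
static uint64_t
xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * Return the raw counter value plus a random delta in [0, max_delta].
 * Two successive fuzzed readings are not guaranteed to be increasing.
 */
static uint64_t
fuzz_tsc(uint64_t raw, uint64_t max_delta, uint64_t *rng_state)
{
	uint64_t r = xorshift64(rng_state);

	if (max_delta == UINT64_MAX)	/* avoid max_delta + 1 wrapping to 0 */
		return raw + r;
	return raw + r % (max_delta + 1);
}
```

With max_delta = 0 the counter is returned unchanged; with, say,
max_delta = 64 each reading is shifted forward by at most 64 cycles.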

