Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

From: Linus Torvalds
Date: Mon Jan 08 2018 - 19:58:27 EST


On Mon, Jan 8, 2018 at 4:44 PM, Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:
>
> Essentially the RSB are hidden registers, and the only way to clear them
> is the FILL_RETURN_BUFFER sequence. I don't see how clearing anything else
> would help?

Forget theory. Look at practice.

Let's just assume that the attacker can write arbitrarily to the RSB
state. Just accept it.

If you accept that, then you turn the question instead into: are there
things we can do to make that useless to an attacker.

And there the point is that even if you control the RSB contents, you
need to find something relevant to *put* in the RSB. You need to find
the gadget that makes that control of the RSB useful.

That's where clearing the other registers comes in. Particularly for
the shallow cases (maybe the attack involves calling a shallow system
call that goes to the scheduler immediately, like "pause()").

Finding those gadgets is already hard. And then you have to find them
in such a way that you can control some state that lets you start
reading out arbitrary memory. So you need to not just find the gadget,
you need to pass in an interesting pointer to it.

Preferably a pointer you control easily, like one of the registers
that nobody used on the way from the attacking user space to the
schedule() call. That's already going to be damn hard, but with the C
compiler saving many registers by default, it might not be impossible.

And THAT is where "clear the registers" comes in. It adds _another_
huge barrier to an attack that was already pretty hard to begin with.

If we clear the registers, what the hell are you going to put in the
RSB that helps you?
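
To make that concrete, here is a rough user-space toy model of the
idea. It is only a sketch, not kernel code: the struct, the function
names and the "gadget" are all made up for illustration, and it
assumes the x86-64 syscall argument registers (rdi, rsi, rdx, r10, r8,
r9) plus the callee-saved set. It shows nothing more than that a value
the user parked in an untouched register no longer reaches a
hypothetical gadget once the entry path zeroes what the syscall
doesn't need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the user register state at syscall entry. */
struct toy_regs {
	/* Argument registers the syscall may legitimately consume. */
	uint64_t rdi, rsi, rdx, r10, r8, r9;
	/* Callee-saved registers: they keep user values unless cleared. */
	uint64_t rbx, rbp, r12, r13, r14, r15;
};

/* Zero every register that the first 'nargs' arguments don't need. */
static void toy_clear_unused(struct toy_regs *r, unsigned int nargs)
{
	uint64_t *args[] = {
		&r->rdi, &r->rsi, &r->rdx, &r->r10, &r->r8, &r->r9
	};
	size_t i;

	assert(nargs <= sizeof(args) / sizeof(args[0]));

	for (i = nargs; i < sizeof(args) / sizeof(args[0]); i++)
		*args[i] = 0;
	r->rbx = r->rbp = r->r12 = r->r13 = r->r14 = r->r15 = 0;
}

/*
 * Stand-in for a speculatively reached gadget: it is only useful to
 * the attacker if rbx still holds a pointer the attacker chose.
 */
static uint64_t toy_gadget_input(const struct toy_regs *r)
{
	return r->rbx;
}

/* Model of a zero-argument syscall like pause() heading to the scheduler. */
static uint64_t toy_pause_entry(struct toy_regs r, int clear)
{
	if (clear)
		toy_clear_unused(&r, 0);
	return toy_gadget_input(&r);
}
```

Without the clearing step, toy_pause_entry() hands the gadget exactly
what the user loaded into rbx. With it, the gadget gets zero, no
matter what ended up in the RSB.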

I really think that people need to think about the actual _practical_
side here. We're never ever going to solve all theoretical timing
attacks. But when it comes to something like Spectre, it's already
very much non-trivial to attack. When you then look at something like
"a few call chains deep in the scheduler", it gets harder still. If we
clear registers and don't give the attacker a way to leak data, that's
yet another big barrier.

So instead of saying "we have to flush the return stack", I'm saying
that we should look at things that make flushing the return stack
_unnecessary_, simply because even if the attacker were to control it
entirely, they'd still be up shit creek without a paddle.

Linus