Re: [GIT pull] x86/asm for 5.1

From: Linus Torvalds
Date: Sun Mar 10 2019 - 17:43:37 EST


On Sun, Mar 10, 2019 at 4:33 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
>
> +extern volatile unsigned long cr4_pin;
> +
> static inline void native_write_cr4(unsigned long val)
> {
> + unsigned long warn = 0;
> +
> +again:
> + val |= cr4_pin;
> asm volatile("mov %0,%%cr4": : "r" (val), "m" (__force_order));
> + /*
> + * If the MOV above was used directly as a ROP gadget we can
> + * notice the lack of pinned bits in "val" and start the function
> + * from the beginning to gain the cr4_pin bits for sure. Note
> + * that "val" must be volatile to keep the compiler from
> + * optimizing away this check.
> + */
> + if ((val & cr4_pin) != cr4_pin) {
> + warn = ~val & cr4_pin;
> + goto again;
> + }
> + WARN_ONCE(warn, "Attempt to unpin cr4 bits: %lx; bypass attack?!\n",
> + warn);
> }

Yeah, no.

No way am I pulling a "security fix" that is that stupid, and that
badly documented. The comments are actively *not* matching the code,
and the actual code is so horribly written that it forces the compiler
to do extra stupid and unnecessary things (reloading that 'cr4_pin"
twice _after_ the thing you're trying to fix, for no actual good
reason - it only needs one reload).

The asm could trivially have just had a memory clobber in it, instead
of the wrongheaded - and incorrectly documented - volatile.

And btw, when you do the same thing twice, just do it the same way,
instead of having two different ways of doing it. There was a totally
different model to handling native_write_cr0() having similar issues.

Finally, I suspect this is all subtly buggy anyway, because your
'cr4_pin' thing is a global variable, but the cr4 bits are per-cpu,
and I don't see how this works with the different CPU's setting things
up one after each other, but the first one sets the pinned bits.

So it basically "pins" the bits on CPU's before they are ready, so the
secondary CPU's maghically get the bits set _independently_ of running
the actual setup code to set them. That may *work*, but it sure looks
iffy.

In particular, the whole problem with "the compiler might optimize it
away" comes from the fact that you do the "or" before even writing the
value, which shouldn't have been done anyway, but was forced by the
fact that you used a global mask for something that was really per-cpu
state to be set up (ie you *had* to set the bits wrong on the CPU
early, because you tested the wrong global bits afterwards),

If you had made the pinned bits value be percpu, all of the problems
would have gone away, because it wouldn't have had that bogus "val |=
cr4_pin" before.

So I think this code is
(a) buggy
(b) actively incorrectly documented
(c) inconsistent
(d) badly designed

and when you have code where the _only_ raison d'etre is security, you
had better have some seriously higher requirements than this.

IOW, the code probably should just do

DECLARE_PER_CPU_READ_MOSTLY(unsigned long, cr4_pin);

static inline void native_write_cr4(unsigned long val)
{
for (;;) {
unsigned long pin;
asm volatile("mov %0,%%cr4": : "r" (val):"memory");
pin = __this_cpu_read(cr4_pin);
if (likely((val & pin) == pin)
return;
WARN_ONCE("Attempt to unpin cr4 bits: %lx; bypass
attack?!\n", pin & ~val);
val |= pin;
}
}

which is a lot more straightforward, doesn't have that odd data
pollution between CPU's, and doesn't need to play any odd games.

And no, the above isn't tested, of course. But at least it makes sense
and doesn't have odd volatiles or strange "set global state and
percolate it into percpu stuff by magic". It *might* work.

Would you perhaps want to add a percpu section that is
read-after-boot? Maybe. But honestly, if the attacker can already
modify arbitrary percpu data, you may have bigger issues than cr4.

Linus