RE: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

From: Luck, Tony
Date: Tue Nov 11 2014 - 20:07:14 EST


> I've thought about one sneaky option. If we can reliably determine
> that we're an innocent bystander of a broadcast #MC, can we send an
> IPI-to-self and return without clearing MCIP? Then we get another
> interrupt as soon as interrupts are enabled, and we can clear MCIP at
> a time when we're definitely not running on the IST stack.

Innocent bystanders have RIPV=1, EIPV=0 in MCG_STATUS ... so they
are quite easy to spot. Perhaps we might look at subverting the silly
broadcast by just having them immediately clear MCG_STATUS and iret
(i.e. not go to do_machine_check() at all). That would require lots of
surgery to do_machine_check() and friends - it would no longer know
how many processors to expect to show up. It also opens a different
window - once they are back running normal code they might trip another
machine check while the victims of the first are still processing - so
another "boom, you're dead". The advantage of hitting everyone
with the machine check is that it lessens the chance that another will
happen, since everyone is then running in (and looking at) just a few
pages of kernel code & data.
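
For illustration, the bystander test described above could be sketched
roughly as follows. The bit positions are the architectural ones for
IA32_MCG_STATUS; the helper name mce_innocent_bystander() is made up
for this sketch and is not an existing kernel function, and reading the
MSR itself is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit layout of IA32_MCG_STATUS (MSR 0x17A). */
#define MCG_STATUS_RIPV (1ULL << 0) /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1) /* error IP valid */
#define MCG_STATUS_MCIP (1ULL << 2) /* machine check in progress */

/*
 * Hypothetical helper (name is an assumption, not a kernel API):
 * returns true when the MCG_STATUS value matches the "innocent
 * bystander" pattern RIPV=1, EIPV=0. MCIP and any other bits are
 * deliberately ignored.
 */
static bool mce_innocent_bystander(uint64_t mcg_status)
{
	uint64_t mask = MCG_STATUS_RIPV | MCG_STATUS_EIPV;

	return (mcg_status & mask) == MCG_STATUS_RIPV;
}
```

This only classifies the CPU; it says nothing about when it is safe to
clear MCIP, which is the hard part.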

The worrying part in that is "as soon as interrupts are enabled". Until
we do clear MCIP we're sitting in a mode where another machine check
means instant death, no saving throw. Nominally better than the "we'll
mess the stack up for you" that we are trying to avoid - but the old window
is quite short and known to be bounded. The new one might be a lot bigger.

-Tony