On Thu, Oct 1, 2020 at 2:50 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> On 10/1/20 1:58 PM, Sean Christopherson wrote:
>> One thought for a lowish effort approach to pave the way for CET would be to
>> try XRSTORS multiple times in switch_fpu_return(). If the first try fails,
>> then WARN, init non-supervisor state and try a second time, and if _that_ fails
>> then kill the task. I.e. do the minimum effort to play nice with bad FPU
>> state, but don't let anything "accidentally" turn off CET.
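To make the proposed sequence concrete, here is a rough userspace model
of it in C. This is only a sketch, not kernel code: struct fpu_model,
model_xrstors(), model_init_user_state() and restore_with_retry() are
made-up names for illustration, the WARN is just an fprintf(), and "kill
the task" is modeled as a return value. It makes no attempt to mirror the
real switch_fpu_return() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a task's saved FPU state; not a kernel structure. */
struct fpu_model {
	bool user_state_ok;       /* user (non-supervisor) components restorable? */
	bool supervisor_state_ok; /* supervisor components (e.g. CET) restorable? */
};

enum restore_result {
	RESTORE_OK,        /* first attempt succeeded */
	RESTORE_OK_REINIT, /* succeeded only after user state was reset */
	RESTORE_KILL_TASK, /* both attempts failed */
};

/* Models XRSTORS: succeeds only if every component is valid. */
static bool model_xrstors(const struct fpu_model *f)
{
	return f->user_state_ok && f->supervisor_state_ok;
}

/* Models resetting only the non-supervisor components to init values. */
static void model_init_user_state(struct fpu_model *f)
{
	f->user_state_ok = true;
}

static enum restore_result restore_with_retry(struct fpu_model *f)
{
	assert(f != NULL);

	/* First try: restore the state as saved. */
	if (model_xrstors(f))
		return RESTORE_OK;

	/* Stand-in for WARN(): report the bad state, then reset user state. */
	fprintf(stderr, "WARN: XRSTORS failed, reinitializing user state\n");
	model_init_user_state(f);

	/* Second try: supervisor state is untouched, so CET stays as it was. */
	if (model_xrstors(f))
		return RESTORE_OK_REINIT;

	/* Still failing means supervisor state is bad: give up on the task. */
	return RESTORE_KILL_TASK;
}
```

In this model, bad user state ends in RESTORE_OK_REINIT and bad
supervisor state ends in RESTORE_KILL_TASK. Supervisor state is never
reset along the way, which is the "don't accidentally turn off CET" part.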
> I'm not sure we should ever keep running userspace after an XRSTOR*
> failure. For MPX, this might have provided a nice, additional vector
> for an attacker to turn off MPX. Same for pkeys if we didn't correctly
> differentiate between the hardware init state versus the "software init"
> state that we keep in init_task.
>
> What's the advantage of letting userspace keep running after we init its
> state? That it _might_ be able to recover?
I suppose we can kill userspace and change that behavior only if
someone complains. I still think it would be polite to try to dump
core, but that could be tricky with the current code structure. I'll
try to whip up a patch. Maybe I'll add a debugfs file to trash MXCSR
for testing.