Re: How should we handle illegal task FPU state?

From: Andy Lutomirski
Date: Thu Oct 08 2020 - 20:08:51 EST


On Thu, Oct 8, 2020 at 11:08 AM Yu, Yu-cheng <yu-cheng.yu@xxxxxxxxx> wrote:
>
> On 10/1/2020 3:04 PM, Andy Lutomirski wrote:
> > On Thu, Oct 1, 2020 at 2:50 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >>
> >> On 10/1/20 1:58 PM, Sean Christopherson wrote:
> >>> One thought for a lowish effort approach to pave the way for CET would be to
> >>> try XRSTORS multiple times in switch_fpu_return(). If the first try fails,
> >>> then WARN, init non-supervisor state and try a second time, and if _that_ fails
> >>> then kill the task. I.e. do the minimum effort to play nice with bad FPU
> >>> state, but don't let anything "accidentally" turn off CET.
> >>
> >> I'm not sure we should ever keep running userspace after an XRSTOR*
> >> failure. For MPX, this might have provided a nice, additional vector
> >> for an attacker to turn off MPX. Same for pkeys if we didn't correctly
> >> differentiate between the hardware init state versus the "software init"
> >> state that we keep in init_task.
> >>
> >> What's the advantage of letting userspace keep running after we init its
> >> state? That it _might_ be able to recover?
> >
> > I suppose we can kill userspace and change that behavior only if
> > someone complains. I still think it would be polite to try to dump
> > core, but that could be tricky with the current code structure. I'll
> > try to whip up a patch. Maybe I'll add a debugfs file to trash MXCSR
> > for testing.
> >
>
> One complication of letting XRSTORS fail is exit_to_user_mode_prepare()
> will need to go back to exit_to_user_mode_loop() again (or repeat some
> parts of it).
>
> Currently, by the time exit_to_user_mode_loop() exits, xstates should already
> have been validated and are about to be restored. At this stage, XRSTORS
> should not fault. If we need to kill the task, we should have done that
> earlier.

We can still do_exit(). I'll ponder this.
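
For reference, the two-try scheme Sean describes above can be modeled in
user space roughly as follows. This is only a toy sketch, not kernel code:
struct fpu_model, model_xrstors(), model_init_user_state() and
restore_with_retry() are hypothetical names made up for illustration, and the
"restore" is reduced to checking two validity flags. It assumes that
supervisor state (e.g. CET) is never reinitialized on failure, so a bad
supervisor component always ends in killing the task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a task's saved xstate; not a real kernel structure. */
struct fpu_model {
	bool user_state_valid;       /* e.g. false if MXCSR has reserved bits set */
	bool supervisor_state_valid; /* e.g. CET state; never reset in this model */
};

enum restore_result {
	RESTORE_OK,            /* first attempt succeeded */
	RESTORE_OK_AFTER_INIT, /* succeeded after user state was reinitialized */
	RESTORE_KILL_TASK,     /* both attempts failed */
};

/* Stand-in for XRSTORS: "faults" if any component is invalid. */
static bool model_xrstors(const struct fpu_model *f)
{
	return f->user_state_valid && f->supervisor_state_valid;
}

/* Stand-in for resetting only the non-supervisor components. */
static void model_init_user_state(struct fpu_model *f)
{
	f->user_state_valid = true;
}

static enum restore_result restore_with_retry(struct fpu_model *f)
{
	assert(f != NULL);

	if (model_xrstors(f))
		return RESTORE_OK;

	/* First failure: warn, then reset user state but leave supervisor state alone. */
	fprintf(stderr, "WARN: restore failed, reinitializing user state\n");
	model_init_user_state(f);

	if (model_xrstors(f))
		return RESTORE_OK_AFTER_INIT;

	/* Second failure: the caller is expected to kill the task. */
	return RESTORE_KILL_TASK;
}
```

Dave's objection applies to the RESTORE_OK_AFTER_INIT case: if we never want
to keep running userspace after a failed restore, that branch would simply
become another kill path.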