Re: [PATCH v29 23/32] x86/cet/shstk: Add user-mode shadow stack support

From: Dave Hansen
Date: Wed Sep 01 2021 - 11:24:34 EST


On 9/1/21 6:00 AM, Borislav Petkov wrote:
>> So, this code takes the coward's way out: it *forces* TIF_NEED_FPU_LOAD
>> to be clear by making the registers live with fpregs_restore_userregs().
>> That lets it just use WRMSR instead of dealing with the XSAVE buffer
>> directly. If it didn't do this with the *WHOLE* set of user FPU state,
>> we'd need more fine-granted "NEED_*_LOAD" tracking than our one FPU bit.
>>
>> This is also *only* safe because the task is newly-exec()'d and the FPU
>> state was just reset. Otherwise, we might have had to worry that the
>> non-PL3 SSPs have garbage or that non-SHSTK bits are set in MSR_IA32_U_CET.
>>
>> That said, after staring at it, I *think* this code is functionally
>> correct and OK performance-wise.
> Right, except that that is being done in
> setup_signal_shadow_stack()/restore_signal_shadow_stack() too, for the
> restore token.
>
> Which means, a potential XRSTOR each time just for a single MSR. That
> means, twice per signal in the worst case.
>
> Which means, shadow stack should be pretty noticeable in signal-heavy
> benchmarks...

Ahh, good point. I totally missed the signal side.

Yu-cheng, it would probably be wise to take a look at where
TIF_NEED_FPU_LOAD is set in the signal paths. I suspect the new shadow
stack code is clearing it pretty quickly. That's not *necessarily* a
waste because it has to be XRSTOR'd somewhere before returning to
userspace. But, it does impose an extra XSAVE/XRSTOR burden if the code
is preempted somewhere between
setup_signal_shadow_stack()/restore_signal_shadow_stack() and the return
to usersspace.

Some performance numbers would be nice.

Even nicer would be to make your code work without requiring
TIF_NEED_FPU_LOAD to be clear, or actively clearing it.