Re: [PATCH 0/6] Introduce CET supervisor state support

From: Dave Hansen
Date: Thu Jul 11 2024 - 16:58:47 EST


On 7/8/24 20:17, Yang, Weijiang wrote:
> So I'm not sure whether XFEATURE_MASK_KERNEL_DYNAMIC and related changes
> are worth or not for this series.
>
> Could you share your thoughts?

First of all, I really do appreciate when folks make the effort to _try_
to draw their own conclusions before asking the maintainers to share
theirs. Next time, OK? ;)

But here goes. So we've basically got three cases. Here's a fancy table:

> https://docs.google.com/spreadsheets/d/e/2PACX-1vROHIgrtHzUJmdlzT7D7tuVzgM8AMlK2XlorvFIJvk-I0NjD7A-T_qntjz7cUJlCScfWGtSfPK30Xtu/pubhtml

... and the same in ASCII

Case | IA32_XSS[12] | Space | RFBM[12] | Drop%
-----+--------------+-------+----------+------
  1  |      0       | None  |    0     | 0.0%
  2  |      1       | None  |    0     | 0.2%
  3  |      1       | 24B?  |    1     | 0.2%

Case 1 is the baseline of course. Case 2 avoids allocating space for
CET-S and leans on the kernel to set RFBM[12]==0, which tells the
hardware not to write CET-S state. Case 3 wastes the CET-S space in
each task and leans on the hardware init optimization to avoid
writing out CET-S state on each XSAVES.
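
To make the three cases concrete, here is a rough userspace sketch (not
kernel code) of how RFBM[12] falls out in each one. It assumes the usual
XSAVES rule that RFBM is (XCR0 | IA32_XSS) ANDed with the instruction
mask in EDX:EAX, and that bit 12 is CET-S. The helper names and the
example XCR0 value are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 12 of IA32_XSS / RFBM is CET supervisor state (CET-S). */
#define CET_S_BIT	12
#define CET_S_MASK	(UINT64_C(1) << CET_S_BIT)

/*
 * XSAVES requested-feature bitmap: the features enabled in XCR0 or
 * IA32_XSS, restricted to the mask the caller passes in EDX:EAX.
 */
static uint64_t xsaves_rfbm(uint64_t xcr0, uint64_t xss, uint64_t insn_mask)
{
	return (xcr0 | xss) & insn_mask;
}

/* Returns 1 if RFBM[12] is set, i.e. XSAVES may touch CET-S state. */
static int rfbm_has_cet_s(uint64_t xcr0, uint64_t xss, uint64_t insn_mask)
{
	return (xsaves_rfbm(xcr0, xss, insn_mask) & CET_S_MASK) != 0;
}

/* Hypothetical XCR0 value: x87 | SSE | AVX, nothing else. */
static void check_three_cases(void)
{
	const uint64_t xcr0 = 0x7;

	/* Case 1: IA32_XSS[12] == 0, so RFBM[12] is 0 whatever the mask. */
	assert(!rfbm_has_cet_s(xcr0, 0, ~UINT64_C(0)));
	/* Case 2: IA32_XSS[12] == 1, but bit 12 is cleared in the mask. */
	assert(!rfbm_has_cet_s(xcr0, CET_S_MASK, ~CET_S_MASK));
	/* Case 3: IA32_XSS[12] == 1 and bit 12 is left in the mask. */
	assert(rfbm_has_cet_s(xcr0, CET_S_MASK, ~UINT64_C(0)));
}
```

The only difference between cases 2 and 3 in this sketch is the mask
argument, which is why case 2 needs code to manage that mask while
case 3 needs almost none.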

#1 is: 0 lines of code.
#2 is: 5 files changed, 90 insertions(+), 27 deletions(-)
#3 is: very few lines of code, nearing zero

#2 and #3 have the same performance.

So we're down to choosing between

* $BYTES space in 'struct fpu' (on hardware supporting CET-S)

or

* ~100 loc

$BYTES is 24, right? Did I get anything wrong?
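
For what it's worth, a sketch of where 24 would come from, assuming the
CET-S component is just the three supervisor shadow stack pointers
(IA32_PL0_SSP through IA32_PL2_SSP) and ignoring any alignment padding
in the XSAVE buffer. The struct, field and function names here are
illustrative, not taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed CET-S layout: one 64-bit shadow stack pointer per ring 0-2. */
struct cet_s_state {
	uint64_t pl0_ssp;
	uint64_t pl1_ssp;
	uint64_t pl2_ssp;
};

/* Three 8-byte fields, no padding expected. */
static_assert(sizeof(struct cet_s_state) == 24,
	      "CET-S sketch should be 24 bytes");

/* Total space spent in case 3 if every task carries the CET-S area. */
static size_t cet_s_total_bytes(size_t nr_tasks)
{
	return nr_tasks * sizeof(struct cet_s_state);
}
```

With that assumption, 1000 tasks would cost 24000 bytes in case 3.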

So, here's my stake in the ground: I think the 100 lines of code is
probably worth it. But I also hate complicating the FPU code, so I'm
also somewhat drawn to just eating the 24 bytes and moving on.

But I'm still in the "case 2" camp.

Anybody disagree?