Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP

From: Luck, Tony
Date: Tue Sep 28 2021 - 14:50:35 EST


On Mon, Sep 27, 2021 at 04:51:25PM -0700, Dave Hansen wrote:
> That's close to what we want.
>
> The size should probably be implicit. If it isn't implicit, it needs to
> at least be double-checked against the state sizes.
>
> Not to get too fancy, but I think we also want to have a "replace"
> operation which is separate from the "update". Think of a case where
> you are trying to set a bit:
>
> struct pkru_state *pk = start_update_xstate(tsk, XSTATE_PKRU);
> pk->pkru |= 0x100;
> finish_update_xstate(tsk, XSTATE_PKRU, pk);
>
> versus setting a whole new value:
>
> struct pkru_state *pk = start_replace_xstate(tsk, XSTATE_PKRU);
> memset(pkru, sizeof(*pk), 0);
> pk->pkru = 0x1234;
> finish_replace_xstate(tsk, XSTATE_PKRU, pk);
>
> They look similar, but they actually might have very different costs for
> big states like AMX. We can also do some very different debugging for
> them. In normal operation, these _can_ just return pointers directly to
> the fpu...->xstate in some case. But, if we're debugging, we could
> kmalloc() a buffer and do sanity checking before it goes into the task
> buffer.
>
> We don't have to do any of that fancy stuff now. We don't even need to
> do an "update" if all we need for now for XFEATURE_PASID is a "replace".
>
> 1. Hide whether we need to write to real registers
> 2. Hide whether we need to update the in-memory image
> 3. Hide other FPU infrastructure like the TIF flag.
> 4. Make the users deal with a *whole* state in the replace API

Is that difference just whether you need to save the
state from registers to memory (for the "update" case)
or not (for the "replace" case ... where you can ignore
the current register, overwrite the whole per-feature
xsave area and mark it to be restored to registers).

If so, just a "bool full" argument might do the trick?

Also - you have a "tsk" argument in your pseudo code. Is
this needed? Are there places where we need to perform
these operations on something other than "current"?

pseudo-code:

void *begin_update_one_xsave_feature(enum xfeature xfeature, bool full)
{
void *addr;

BUG_ON(!(xsave->header.xcomp_bv & xfeature));

addr = __raw_xsave_addr(xsave, xfeature);

fpregs_lock();

if (full)
return addr;

if (xfeature registers are "live")
xsaves(xstate, 1 << xfeature);

return addr;
}

void finish_update_one_xsave_feature(enum xfeature xfeature)
{
mark feature modified
set TIF bit
fpregs_unlock();
}

-Tony