Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP

From: Dave Hansen
Date: Tue Sep 28 2021 - 15:19:37 EST


On 9/28/21 11:50 AM, Luck, Tony wrote:
> On Mon, Sep 27, 2021 at 04:51:25PM -0700, Dave Hansen wrote:
...
>> 1. Hide whether we need to write to real registers
>> 2. Hide whether we need to update the in-memory image
>> 3. Hide other FPU infrastructure like the TIF flag.
>> 4. Make the users deal with a *whole* state in the replace API
>
> Is that difference just whether you need to save the
> state from registers to memory (for the "update" case)
> or not (for the "replace" case ... where you can ignore
> the current register, overwrite the whole per-feature
> xsave area and mark it to be restored to registers).
>
> If so, just a "bool full" argument might do the trick?

I want to be able to hide the complexity of where the old state comes
from. It might be in registers or it might be in memory or it might be
*neither*. It's possible we're running with stale register state and a
current->...->xsave buffer that has XFEATURES&XFEATURE_FOO == 0.

In that case, the "old" copy might be memcpy'd out of the init_task.
Or, for pkeys, we might build it ourselves with init_pkru_val.
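
To make the three cases concrete, here is a rough userspace sketch of the
lookup order. Everything in it is a made-up stand-in (struct mock_task,
mock_regs_foo, mock_init_foo, fetch_old_foo(), the 8-byte size and the 0x55
init pattern); none of it is kernel code. It only illustrates the decision
a helper would have to hide from callers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only, not kernel definitions. */
#define FOO_STATE_SIZE		8
#define MOCK_XFEATURE_FOO	(1ULL << 0)

struct mock_task {
	bool regs_live;				/* registers hold this task's state */
	uint64_t xstate_bv;			/* mock XSTATE_BV from the XSAVE header */
	uint8_t buf_foo[FOO_STATE_SIZE];	/* feature slot in the XSAVE buffer */
};

/* Pretend hardware registers for feature "FOO". */
static uint8_t mock_regs_foo[FOO_STATE_SIZE];

/* Assumed init image; a real feature defines its own init state. */
static const uint8_t mock_init_foo[FOO_STATE_SIZE] = {
	0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55
};

enum old_src { OLD_FROM_REGS, OLD_FROM_MEM, OLD_FROM_INIT };

/* Copy the "old" FOO state into 'out' and report where it came from. */
static enum old_src fetch_old_foo(const struct mock_task *t, uint8_t *out)
{
	if (t->regs_live) {
		/* Registers are current: read them (stands in for XSAVES/rdpkru). */
		memcpy(out, mock_regs_foo, FOO_STATE_SIZE);
		return OLD_FROM_REGS;
	}
	if (t->xstate_bv & MOCK_XFEATURE_FOO) {
		/* Registers are stale, but the buffer holds valid state. */
		memcpy(out, t->buf_foo, FOO_STATE_SIZE);
		return OLD_FROM_MEM;
	}
	/* Neither: the feature is in its init state, buffer bytes are undefined. */
	memcpy(out, mock_init_foo, FOO_STATE_SIZE);
	return OLD_FROM_INIT;
}
```

The caller never learns which branch was taken unless it asks, which is the
point of hiding it behind one helper.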

> Also - you have a "tsk" argument in your pseudo code. Is
> this needed? Are there places where we need to perform
> these operations on something other than "current"?

Two cases come to mind:
1. Fork/clone where we are doing things to our child's XSAVE buffer
2. ptrace() where we are poking into another task's state

ptrace() goes for the *whole* buffer now. I'm not sure it would need
this per-feature API. I just call it out as something that we might
need in the future.

> pseudo-code:
>
> void *begin_update_one_xsave_feature(enum xfeature xfeature, bool full)
> {
>         void *addr;
>
>         BUG_ON(!(xsave->header.xcomp_bv & xfeature));
>
>         addr = __raw_xsave_addr(xsave, xfeature);
>
>         fpregs_lock();
>
>         if (full)
>                 return addr;

If the feature is marked as in the init state in the buffer
(XSTATE_BV[feature]==0), this addr *could* contain total garbage. So,
we'd want to make sure that the memory contents have the init state
written before handing them back to the caller. That's not strictly
required if the user is writing the whole thing, but it's the nice thing
to do.
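
Roughly what I have in mind, as a userspace sketch. The names
(struct demo_xsave, demo_init_image, demo_begin_update()) and the all-zero
init image are assumptions made for the example, not existing kernel
interfaces. It deliberately leaves XSTATE_BV alone and only shows the
"scrub before handing back" step.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only, not kernel definitions. */
#define DEMO_FEATURE_SIZE	8
#define DEMO_XFEATURE_BIT	(1ULL << 0)

struct demo_xsave {
	uint64_t xstate_bv;			/* mock XSTATE_BV */
	uint8_t feature[DEMO_FEATURE_SIZE];	/* the feature's slot */
};

/* Assumed init image for the feature (all zeroes in this sketch). */
static const uint8_t demo_init_image[DEMO_FEATURE_SIZE] = { 0 };

/*
 * Return a pointer to the feature's slot. If the header says the feature
 * is in its init state, the slot may hold garbage, so write the init
 * image into it first.
 */
static void *demo_begin_update(struct demo_xsave *xs)
{
	if (!(xs->xstate_bv & DEMO_XFEATURE_BIT))
		memcpy(xs->feature, demo_init_image, DEMO_FEATURE_SIZE);

	return xs->feature;
}
```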

>         if (xfeature registers are "live")
>                 xsaves(xstate, 1 << xfeature);

One little note: I don't think we would necessarily need to do an XSAVES
here. For PKRU, for instance, we could just do a rdpkru.

>         return addr;
> }
>
> void finish_update_one_xsave_feature(enum xfeature xfeature)
> {
>         mark feature modified

I think we'd want to do this at the "begin" time. Also, do you mean we
should set XSTATE_BV[feature]?

>         set TIF bit

Since the XSAVE buffer was updated, it now contains the canonical FPU
state. It may have diverged from the register state, thus we need to
set TIF_NEED_FPU_LOAD.

It's also worth noting that we *could*:

xrstors(xstate, 1<<xfeature);

as well. That would bring the registers back up to date and we could
keep TIF_NEED_FPU_LOAD==0.
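
The two options side by side, as a userspace sketch. struct sk_task,
sk_regs and sk_finish_update() are hypothetical names, the bool stands in
for TIF_NEED_FPU_LOAD, and the memcpy() stands in for the xrstors. I'm
assuming the eager restore only makes sense when the registers were
otherwise current, so the sketch falls back to setting the flag if it was
already set.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only, not kernel definitions. */
#define SK_FEATURE_SIZE	8

struct sk_task {
	bool need_fpu_load;		/* mock TIF_NEED_FPU_LOAD */
	uint8_t buf[SK_FEATURE_SIZE];	/* updated feature state in memory */
};

/* Pretend hardware registers for the feature. */
static uint8_t sk_regs[SK_FEATURE_SIZE];

/*
 * Finish an update after the caller modified t->buf.
 *
 * restore_now == false: leave the registers stale and flag a full reload.
 * restore_now == true:  reload just this feature so the flag can stay
 *                       clear. Only done if the registers were otherwise
 *                       current (flag clear on entry).
 */
static void sk_finish_update(struct sk_task *t, bool restore_now)
{
	if (restore_now && !t->need_fpu_load) {
		/* Stands in for xrstors(xstate, 1 << xfeature). */
		memcpy(sk_regs, t->buf, SK_FEATURE_SIZE);
		return;
	}

	/* The buffer is canonical now; registers must be reloaded later. */
	t->need_fpu_load = true;
}
```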

>         fpregs_unlock();
> }
>
> -Tony
>