Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
From: Thomas Gleixner
Date: Wed Sep 29 2021 - 08:28:15 EST

On Wed, Sep 29 2021 at 11:54, Peter Zijlstra wrote:
> On Fri, Sep 24, 2021 at 04:03:53PM -0700, Andy Lutomirski wrote:
>> I think the perfect and the good are a bit confused here. If we go for
>> "good", then we have an mm owning a PASID for its entire lifetime. If
>> we want "perfect", then we should actually do it right: teach the
>> kernel to update an entire mm's PASID setting all at once. This isn't
>> *that* hard -- it involves two things:
>>
>> 1. The context switch code needs to resync PASID. Unfortunately, this
>> adds some overhead to every context switch, although a static_branch
>> could minimize it for non-PASID users.
>
>> 2. A change to an mm's PASID needs to send an IPI, but that IPI can't
>> touch FPU state. So instead the IPI should use task_work_add() to
>> make sure PASID gets resynced.
>
> What do we need 1 for? Any PASID change can be achieved using 2 no?
>
> Basically, call task_work_add() on all relevant tasks [1], then IPI
> spray the current running of those and presto.
>
> [1] it is nigh on impossible to find all tasks sharing an mm in any sane
> way due to CLONE_MM && !CLONE_THREAD.

Why would we want any of that at all?

  Process starts, no PASID assigned.

  bind to device -> PASID is allocated and assigned to the mm

  some task of the process issues ENQCMD -> #GP -> write PASID MSR

After that the PASID is saved and restored as part of the XSTATE and
there is no extra overhead in context switch or return to user space.

Tasks of the process which never used ENQCMD don't care; their PASID
xstate stays in init state.

There is absolutely no point in enforcing that all tasks of the process
have the PASID activated immediately when it is assigned. If they need
it they get it via the #GP fixup and everything just works.
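
A minimal user-space sketch of that lifecycle, to make the per-task
state visible. Everything in it is a hypothetical stand-in (mm_model,
task_model, cpu_pasid_msr, the *_model() helpers); none of it is kernel
code, and an MSR value of 0 is used as a simplification for "PASID not
valid, ENQCMD raises #GP".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, not kernel types. */
struct mm_model {
	unsigned int pasid;		/* 0: no PASID assigned yet */
};

struct task_model {
	struct mm_model *mm;
	unsigned int xstate_pasid;	/* saved PASID state, 0: init state */
};

/* Models the PASID MSR of the single CPU in this sketch. */
static unsigned int cpu_pasid_msr;

/* Context switch: save the outgoing task's PASID state, restore the incoming one's. */
static void switch_to_model(struct task_model *prev, struct task_model *next)
{
	assert(next != NULL);
	if (prev)
		prev->xstate_pasid = cpu_pasid_msr;
	cpu_pasid_msr = next->xstate_pasid;
}

/* bind to device: the PASID is assigned to the mm, no task is touched. */
static void bind_model(struct mm_model *mm, unsigned int pasid)
{
	mm->pasid = pasid;
}

/*
 * The running task issues ENQCMD. An unset MSR models the #GP and the
 * fixup loads the MSR from the mm. Returns false if the fault is genuine.
 */
static bool enqcmd_model(struct task_model *cur)
{
	if (!cpu_pasid_msr) {
		if (!cur->mm->pasid)
			return false;
		cpu_pasid_msr = cur->mm->pasid;
	}
	return true;
}
```

With two tasks sharing one mm, binding PASID 42 and letting only the
first task call enqcmd_model() leaves the second task's xstate_pasid at
0 across any number of switch_to_model() calls, while the first task
gets 42 back every time it is switched in.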

Looking at that patch again, none of this muck in fpu__pasid_write() is
required at all. The whole exception fixup is:

	if (!user_mode(regs))
		return false;

	if (!current->mm->pasid)
		return false;

	if (current->pasid_activated)
		return false;

	wrmsrl(MSR_IA32_PASID, current->mm->pasid);
	current->pasid_activated = true;
	return true;
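
For anyone who wants to poke at the three early returns outside the
kernel, here is a compilable model of the same decision logic. The
fx_* types and the fx_pasid_msr variable are hypothetical stand-ins for
pt_regs, current and the MSR write; the comments on each check are a
reading of the snippet, not something the patch states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel objects the fixup looks at. */
struct fx_mm   { uint32_t pasid; };		/* 0: no PASID assigned */
struct fx_task { struct fx_mm *mm; bool pasid_activated; };
struct fx_regs { bool from_user; };		/* models user_mode(regs) */

static uint64_t fx_pasid_msr;			/* models MSR_IA32_PASID */

/* Returns true if the #GP was handled and the instruction can be retried. */
static bool fx_try_fixup(struct fx_task *cur, const struct fx_regs *regs)
{
	/* A #GP raised in kernel mode is not an ENQCMD issued by user space. */
	if (!regs->from_user)
		return false;

	/* No PASID bound to the mm: nothing to load, the fault is genuine. */
	if (!cur->mm->pasid)
		return false;

	/* Already activated: this #GP has another cause, do not retry forever. */
	if (cur->pasid_activated)
		return false;

	fx_pasid_msr = cur->mm->pasid;		/* models the wrmsrl() */
	cur->pasid_activated = true;
	return true;
}
```

The model succeeds at most once per task: the first user-mode fault
after a PASID has been bound returns true, every later call for the
same task returns false.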

There is zero requirement to look at TIF_NEED_FPU_LOAD or
fpregs_state_valid() simply because the #GP comes straight from user
space, which means the FPU registers contain the current task's user
space state.

If TIF_NEED_FPU_LOAD were set or fpregs_state_valid() were false after
the user_mode() check, then this would simply be a bug somewhere else
and would have nothing to do with this PASID fixup.

So no need for magic update_one_xstate_feature() wrappers, no
concurrency concerns, nothing.

It's that simple, really. Anything more complex is just a purely
academic exercise which creates more problems than it solves.

Thanks,

        tglx