Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

From: Dave Hansen
Date: Wed Mar 24 2021 - 10:25:53 EST


On 3/23/21 2:52 PM, Bae, Chang Seok wrote:
>> "System software may disable use of Intel AMX by clearing XCR0[18:17], by
>> clearing CR4.OSXSAVE, or by setting IA32_XFD[18]. It is recommended that
>> system software initialize AMX state (e.g., by executing TILERELEASE)
>> before doing so. This is because maintaining AMX state in a
>> non-initialized state may have negative power and performance
>> implications."
>>
>> I'm not seeing anything related to this. Is this a recommendation
>> which can be ignored or is that going to be duct taped into the code
>> base once the first user complains about slowdowns of their non AMX
>> workloads on that machine?
> I think this part of the doc is worth mentioning first:
>
> “The XTILEDATA state component is very large, and an operating system may
> prefer not to allocate memory for the XTILEDATA state of every user
> thread. Such an operating system that enables Intel AMX might prefer to
> prevent specific user threads from using the feature. An extension called
> extended feature disable (XFD) is added to the XSAVE feature set to
> support such a usage. XFD is described in Section 3.2.6.”
>
> So, in this series, instead of saving this state always, the state is saved
> only when used. XFD helps to detect each thread’s first use of those
> registers. Thus, the XFD’s MSR bit is maintained as per-task here.

This doesn't really have anything to do with XFD.

The spec says, basically, "as long as you have AMX state in the
registers, you may pay a penalty".

When we switch between userspace tasks, AMX gets automatically
reinitialized by XRSTOR if the task to which we switch is not using AMX.
All is good there.
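
To spell out the XRSTOR behavior I'm relying on, here is a toy model in
C. It only models the tile-data component (XCR0 bit 18) and the SDM rule
that a requested component is loaded from the buffer when its XSTATE_BV
bit is set and put into its init state (all zeroes) otherwise. The
toy_* names are made up for illustration. This is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define XFEATURE_MASK_XTILE_DATA (1ULL << 18)	/* XCR0 bit 18 */
#define TILE_BYTES 8192				/* 8 tiles x 1 KiB */

/* Hypothetical stand-in for the CPU's tile registers. */
struct toy_cpu {
	uint8_t tiledata[TILE_BYTES];
};

/* Hypothetical stand-in for one task's XSAVE buffer. */
struct toy_xsave {
	uint64_t xstate_bv;	/* components that hold non-init data */
	uint8_t tiledata[TILE_BYTES];
};

/* Model of XRSTOR, restricted to the tile-data component. */
void toy_xrstor(struct toy_cpu *cpu, const struct toy_xsave *buf,
		uint64_t rfbm)
{
	assert(cpu && buf);

	/* Component not requested: the registers are left untouched. */
	if (!(rfbm & XFEATURE_MASK_XTILE_DATA))
		return;

	if (buf->xstate_bv & XFEATURE_MASK_XTILE_DATA)
		memcpy(cpu->tiledata, buf->tiledata, TILE_BYTES);
	else
		memset(cpu->tiledata, 0, TILE_BYTES);	/* init state */
}

/* True when the modeled tile registers are all zeroes. */
bool toy_tiles_in_init(const struct toy_cpu *cpu)
{
	for (size_t i = 0; i < TILE_BYTES; i++)
		if (cpu->tiledata[i])
			return false;
	return true;
}
```

Restoring a buffer whose XSTATE_BV bit 18 is clear leaves the modeled
registers in init state, which is the userspace-to-userspace case. A
switch that never executes XRSTOR with bit 18 requested leaves whatever
the previous task loaded.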

But what if we remain in the kernel? Let's say kswapd is going to run
for a while. Does kswapd pay the AMX-not-in-init-state penalty? Or
what if we want to go idle? Does live AMX state limit how deep an idle
state the CPU can enter?

We probably want to actively go out and zap AMX state at some
well-defined boundary. It's radioactive. Task switching seems as sane
a place as any to do that.
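
Roughly what I have in mind, as a userspace toy in C rather than a
patch. Everything named sketch_* is hypothetical, and the memset()
stands in for what TILERELEASE would do on real hardware. The only point
is the ordering: save the outgoing task's tiles, then put the registers
back into init state before anything else runs on the CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TILE_BYTES 8192	/* 8 tiles x 1 KiB */

/* Hypothetical per-task save area for tile data. */
struct sketch_task {
	bool saved_valid;	/* save area holds non-init tile data */
	uint8_t saved_tiles[SKETCH_TILE_BYTES];
};

/* Hypothetical per-CPU view of the tile registers. */
struct sketch_cpu {
	bool tiles_live;	/* registers hold non-init AMX state */
	uint8_t tiles[SKETCH_TILE_BYTES];
};

/* Stand-in for TILERELEASE: put the tile registers into init state. */
void sketch_tilerelease(struct sketch_cpu *cpu)
{
	memset(cpu->tiles, 0, SKETCH_TILE_BYTES);
	cpu->tiles_live = false;
}

/* Called when switching away from prev, whatever runs next. */
void sketch_switch_from(struct sketch_cpu *cpu, struct sketch_task *prev)
{
	assert(cpu && prev);

	/* Nothing live in the registers: nothing to save or zap. */
	if (!cpu->tiles_live)
		return;

	/* Save first so the task's data survives the zap. */
	memcpy(prev->saved_tiles, cpu->tiles, SKETCH_TILE_BYTES);
	prev->saved_valid = true;

	/* Zap, so kswapd or idle never runs with live tile state. */
	sketch_tilerelease(cpu);
}
```

The zap costs nothing for tasks that never touched AMX, since the early
return skips both the copy and the release.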