Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

From: Andy Lutomirski
Date: Wed Mar 24 2021 - 17:27:37 EST

> On Mar 24, 2021, at 2:09 PM, Len Brown <lenb@xxxxxxxxxx> wrote:
>
> On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 <jing2.liu@xxxxxxxxxxxxxxx> wrote:
>
>>> IMO, the problem with AVX512 state
>>> is that we guaranteed it will be zero for XINUSE=0.
>>> That means we have to write 0's on saves.
>
>> Why do "we have to write 0's on saves" when XINUSE=0?
>>
>> According to the SDM, if XINUSE=0, XSAVES will *not* save the data and the
>> xstate_bv bit is 0; XSAVE does need to save the state, but the
>> xstate_bv bit is also 0.
>>> It would be better
>>> to be able to skip the write -- even if we can't save the space
>>> we can save the data transfer. (This is what we did for AMX).
>> With the XFD feature and XFD=1, the XSAVE instruction still has to save INIT
>> state to the area. So it seems that with XINUSE=0 and XFD=1, the XSAVE(S)
>> instructions behave the same, and both can help save the data transfer.
>
> Hi Jing, Good observation!
>
> There are 3 cases.
>
> 1. Task context switch save into the context switch buffer.
> Here we use XSAVES, and as you point out, XSAVES includes
> the compaction optimization feature tracked by XINUSE.
> So when AMX is enabled, but clean, XSAVES doesn't write zeros.
> Further, it omits the buffer space for AMX in the destination altogether!
> However, since XINUSE=1 is possible, we have to *allocate* a buffer
> large enough to handle the dirty data for when XSAVES cannot
> employ that optimization.
>
> 2. Entry into user signal handler saves into the user space sigframe.
> Here we use XSAVE, and so the hardware will write zeros for XINUSE=0,
> and for AVX512, we save neither time nor space.
>
> My understanding is that for application compatibility, we can *not* compact
> the destination buffer that user-space sees. This is because existing code
> may have adopted fixed-size offsets (which is unfortunate).
>
> And so, for AVX512, we both reserve the space, and we write zeros
> for clean AVX512 state.
>
> For AMX, we must still reserve the space, but we are not going to write zeros
> for clean state. We do this in software by checking XINUSE=0, and clearing
> the xstate_bv for the XSAVE. As a result, for XINUSE=0, we can skip
> writing the zeros, even though we can't compress the space.

Why?

>
> 3. User space always uses fully uncompacted XSAVE buffers.
>

There is no reason we have to do this for new states. Arguably we shouldn’t for AMX, to avoid yet another altstack explosion.
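
For reference, the software check Len describes in case 2 might look roughly like the sketch below. It is an illustration only, not the code from the patch series: the helper name sigframe_rfbm() and the macro names are made up here, and only the state-component bit numbers (5-7 for AVX-512, 18 for AMX tile data) come from the SDM. The xinuse argument is assumed to be the value returned by XGETBV with ECX=1.

```c
#include <assert.h>
#include <stdint.h>

/* State-component bit numbers as listed in the SDM; macro names are illustrative. */
#define XMASK_OPMASK     (1ULL << 5)   /* AVX-512 opmask registers k0-k7 */
#define XMASK_ZMM_HI256  (1ULL << 6)   /* upper halves of ZMM0-ZMM15 */
#define XMASK_HI16_ZMM   (1ULL << 7)   /* ZMM16-ZMM31 */
#define XMASK_XTILEDATA  (1ULL << 18)  /* AMX tile data */

/*
 * Hypothetical helper: pick the requested-feature bitmap (the EDX:EAX
 * operand of XSAVE) for a signal frame.
 *
 * AVX-512 bits are always requested, so XSAVE writes zeros for clean
 * state. The AMX tile data bit is dropped when the state is not in use,
 * so the space stays reserved in the uncompacted layout but nothing is
 * written to it.
 */
uint64_t sigframe_rfbm(uint64_t enabled, uint64_t xinuse)
{
    uint64_t rfbm = enabled;

    if (!(xinuse & XMASK_XTILEDATA))
        rfbm &= ~XMASK_XTILEDATA;

    return rfbm;
}

/* Small usage example: AVX-512 and AMX enabled, AMX clean vs. dirty. */
void sigframe_rfbm_example(void)
{
    uint64_t enabled = XMASK_OPMASK | XMASK_ZMM_HI256 |
                       XMASK_HI16_ZMM | XMASK_XTILEDATA;

    /* Clean AMX: tile data is skipped, AVX-512 is still requested. */
    assert(sigframe_rfbm(enabled, 0) == (enabled & ~XMASK_XTILEDATA));

    /* Dirty AMX: everything enabled is requested. */
    assert(sigframe_rfbm(enabled, XMASK_XTILEDATA) == enabled);
}
```

With this approach the caller would presumably also have to make sure the skipped component's xstate_bv bit in the frame header reads 0, so that a later XRSTOR treats the component as being in its init state.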