Re: [PATCH v5 24/28] x86/fpu/xstate: Use per-task xstate mask for saving xstate in signal frame
From: Len Brown
Date: Mon May 24 2021 - 14:06:52 EST
On Sun, May 23, 2021 at 11:15 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
> If I'm reading this right, it means that tasks that have ever used AMX
> get one format and tasks that haven't get another one.
No. The format of the XSTATE on the signal stack is uncompressed XSAVE
format for both AMX and non-AMX tasks, both before and after this patch.
That is because XSAVE gets the format from XCR0. It gets the set of fields
to write from the run-time parameter (the mask passed in EDX:EAX).
So the change here allows a non-AMX task to skip writing data (zeros)
to the AMX region of its XSTATE buffer.
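
To illustrate the distinction between layout and write mask, here is a toy
model in C. It is a sketch, not kernel code: the component names, sizes, and
the model_save() helper are all hypothetical, and real offsets and sizes come
from CPUID leaf 0xD. It only shows that narrowing the mask leaves every
component at the same offset while skipping the stores to the large region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical components with made-up sizes (not real XSAVE numbers). */
enum { COMP_A, COMP_B, COMP_BIG, NR_COMPS };

struct comp_layout {
	size_t offset;
	size_t size;
};

/* Fixed layout: it depends only on which features are enabled, never
 * on the mask passed to an individual save. */
static const struct comp_layout layout[NR_COMPS] = {
	[COMP_A]   = {  0,  16 },
	[COMP_B]   = { 16,  32 },
	[COMP_BIG] = { 48, 128 },	/* stands in for a large region such as AMX tile data */
};

#define BUF_SIZE 176	/* 16 + 32 + 128 */

/*
 * Copy only the components selected by (enabled & mask) from regs to buf.
 * Unselected components are left untouched in buf.
 * Returns the number of bytes written.
 */
static size_t model_save(uint8_t *buf, const uint8_t *regs,
			 uint64_t enabled, uint64_t mask)
{
	uint64_t requested = enabled & mask;
	size_t written = 0;

	for (int i = 0; i < NR_COMPS; i++) {
		if (!(requested & (1ULL << i)))
			continue;
		assert(layout[i].offset + layout[i].size <= BUF_SIZE);
		memcpy(buf + layout[i].offset, regs + layout[i].offset,
		       layout[i].size);
		written += layout[i].size;
	}
	return written;
}
```

In this model, saving with all three bits set touches the whole buffer, while
clearing the COMP_BIG bit writes only the first 48 bytes and COMP_B still
lands at offset 16.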
The subsequent patch adds a further optimization: it (manually) checks
whether an AMX task's AMX state is in INIT state and, if so, also skips
writing data (zeros).
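
The INIT-state check can be pictured as trimming the save mask before the
save is issued. The sketch below is illustrative only and is not the code
from the patch: trim_init_features() is a hypothetical helper, and the
in_use argument is assumed to be an "in use" bitmap of the kind that XGETBV
with ECX=1 reports on CPUs that support it. The two bit definitions follow
the architectural XCR0 positions for AMX (17 = TILECFG, 18 = TILEDATA).

```c
#include <assert.h>
#include <stdint.h>

#define XFEATURE_MASK_XTILE_CFG		(1ULL << 17)
#define XFEATURE_MASK_XTILE_DATA	(1ULL << 18)

/*
 * Hypothetical helper: drop from task_mask every dynamic feature that is
 * currently in INIT state (its bit is clear in in_use), so that the save
 * does not write that feature's region at all.
 */
static uint64_t trim_init_features(uint64_t task_mask, uint64_t in_use,
				   uint64_t dynamic)
{
	uint64_t droppable = dynamic & ~in_use;
	uint64_t result = task_mask & ~droppable;

	/* Trimming may only remove bits, never add them. */
	assert((result & ~task_mask) == 0);
	return result;
}
```

With XFEATURE_MASK_XTILE_DATA treated as the only dynamic feature, a task
whose tile data is in INIT state ends up with that bit cleared from its mask,
while a task with live tile data keeps its mask unchanged.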
We should have done this optimization for AVX-512, but instead we
guaranteed writing zeros, which I think is a waste of both transfer time
and cache footprint.
Len Brown, Intel Open Source Technology Center