Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
From: Andy Lutomirski
Date: Tue Mar 30 2021 - 16:21:36 EST
> On Mar 30, 2021, at 12:12 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>
> On 3/30/21 10:56 AM, Len Brown wrote:
>> On Tue, Mar 30, 2021 at 1:06 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>>> On Mar 30, 2021, at 10:01 AM, Len Brown <lenb@xxxxxxxxxx> wrote:
>>>> Is it required (by the "ABI") that a user program has everything
>>>> on the stack for user-space XSAVE/XRSTOR to get back
>>>> to the state of the program just before receiving the signal?
>>> The current Linux signal frame format has XSTATE in uncompacted format,
>>> so everything has to be there.
>>> Maybe we could have an opt in new signal frame format, but the details would need to be worked out.
>>>
>>> It is certainly the case that a signal should be able to be delivered, run “async-signal-safe” code,
>>> and return, without corrupting register contents.
>> And so, an acknowledgement:
>>
>> We can't change the legacy signal stack format without breaking
>> existing programs. The legacy format is uncompacted XSTATE. It is a
>> complete set of architectural state -- everything necessary to
>> XRSTOR. Further, the sigreturn flow allows the signal handler to
>> *change* any of that state, so that it becomes active upon return from
>> signal.
>
> One nit with this: XRSTOR itself can work with the compacted format or
> uncompacted format. Unlike the XSAVE/XSAVEC side where compaction is
> explicit from the instruction itself, XRSTOR changes its behavior by
> reading XCOMP_BV. There's no XRSTORC.
>
> The issue with using the compacted format is when legacy software in the
> signal handler needs to go access the state. *That* is what can't
> handle a change in the XSAVE buffer format (either optimized/XSAVEOPT,
> or compacted/XSAVEC).
The compacted format isn’t compact enough anyway. If we want to keep AMX and AVX-512 enabled in XCR0, then we need to muck with the format further so that it omits the not-in-use features. I *think* we can pull this off in a way that still does the right thing with respect to XRSTOR.
If we go this route, I think we want a way for sigreturn to accept a pointer to the state, rather than inline state, so that programs can still change the state. Or maybe we just want a way to ask sigreturn to skip the restore entirely.
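
To put rough numbers on the size point, here is a minimal C sketch. The per-component sizes and 64-byte alignment flags in the table are illustrative assumptions, not authoritative values; real code should read them from CPUID leaf 0xD. The names xsave_is_compacted() and compacted_size() are made up for this example. It shows the XCOMP_BV bit 63 check that XRSTOR keys off, and how a compacted layout shrinks when only the in-use components are kept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XSAVE_LEGACY_SIZE   512u                     /* x87/SSE legacy region */
#define XSAVE_HDR_SIZE       64u                     /* XSAVE header */
#define XCOMP_BV_OFFSET     (XSAVE_LEGACY_SIZE + 8u) /* second u64 in header */
#define XCOMP_BV_COMPACTED  (1ULL << 63)

/* Hypothetical component table. Sizes and alignment flags are
 * assumptions for illustration; query CPUID leaf 0xD in real code. */
struct xcomp {
    unsigned bit;     /* bit position in XCR0 / XSTATE_BV */
    uint32_t size;    /* bytes */
    int      align64; /* 64-byte aligned in the compacted format? */
};

static const struct xcomp comps[] = {
    {  2,  256, 0 }, /* AVX (upper halves of YMM) */
    {  5,   64, 0 }, /* AVX-512 opmask */
    {  6,  512, 0 }, /* AVX-512 ZMM_Hi256 */
    {  7, 1024, 0 }, /* AVX-512 Hi16_ZMM */
    {  9,    8, 0 }, /* PKRU */
    { 17,   64, 1 }, /* AMX TILECFG */
    { 18, 8192, 1 }, /* AMX TILEDATA */
};

/* XRSTOR picks the format from bit 63 of XCOMP_BV in the header. */
static int xsave_is_compacted(const void *buf)
{
    uint64_t xcomp_bv;

    assert(buf != NULL);
    memcpy(&xcomp_bv, (const unsigned char *)buf + XCOMP_BV_OFFSET,
           sizeof(xcomp_bv));
    return (xcomp_bv & XCOMP_BV_COMPACTED) != 0;
}

/* Size of a compacted buffer holding exactly the components in 'mask'.
 * x87/SSE (bits 0 and 1) always occupy the fixed legacy region. */
static uint32_t compacted_size(uint64_t mask)
{
    uint32_t off = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;
    size_t i;

    for (i = 0; i < sizeof(comps) / sizeof(comps[0]); i++) {
        if (!(mask & (1ULL << comps[i].bit)))
            continue;
        if (comps[i].align64)
            off = (off + 63u) & ~63u;
        off += comps[i].size;
    }
    return off;
}
```

With this assumed table, compacted_size(0x602e7) (everything enabled, including AMX) comes to 10752 bytes, while compacted_size(0x7) (only x87/SSE/AVX in use) is 832 bytes. Almost all of the difference is the 8 KiB of tile data.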