Re: Candidate Linux ABI for Intel AMX and hypothetical new related features

From: Andy Lutomirski
Date: Fri May 21 2021 - 12:31:54 EST




On Fri, May 21, 2021, at 9:19 AM, Florian Weimer wrote:
> * Dave Hansen:
>
> > On 5/21/21 7:44 AM, Florian Weimer wrote:
> >> * Dave Hansen via Libc-alpha:
> >>> Our system calls are *REALLY* fast. We can even do a vsyscall for this
> >>> if we want to get the overhead down near zero. Userspace can also cache
> >>> the "I did the prctl()" state in thread-local storage if it wants to
> >>> avoid the syscall.
> >> Why can't userspace look at XCR0 to make the decision?
> >
> > The thing we're trying to avoid is a #NM exception from XFD (the new
> > first-use detection feature) that occurs on the first use of AMX.
> > XCR0 will have XCR0[AMX]=1, even if XFD is "armed" and ready to
> > generate the #NM.
>
> I see. So essentially the hardware wants to offer transparent
> initialize-on-use, but Linux does not seem to want to implement it this
> way.
>
> Is there still a chance to bring the hardware and Linux into alignment?

arch_prctl(SET_XSTATE_INIT_ON_FIRST_USE, TILE_STUFF);?

As long as this is allowed to fail, I don’t have a huge problem with it.
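
A minimal sketch of what "allowed to fail" could look like from the caller's side. Everything named here is an assumption for illustration: the SET_XSTATE_INIT_ON_FIRST_USE value, the TILE_STUFF mask, and the function-pointer wrapper are hypothetical and match no existing kernel header. The only hardware fact used is that AMX tile state occupies XCR0 bits 17 (XTILECFG) and 18 (XTILEDATA).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request code and feature mask; not defined by any kernel. */
#define SET_XSTATE_INIT_ON_FIRST_USE 0x7001
#define TILE_STUFF ((1UL << 17) | (1UL << 18)) /* XTILECFG | XTILEDATA */

/* Stand-in for an arch_prctl() wrapper: returns 0, or -1 with errno set. */
typedef int (*arch_prctl_fn)(int code, unsigned long arg);

/* Ask for init-on-first-use; report whether AMX code paths may be taken. */
static bool amx_init_on_first_use(arch_prctl_fn do_arch_prctl)
{
    assert(do_arch_prctl != NULL);
    if (do_arch_prctl(SET_XSTATE_INIT_ON_FIRST_USE, TILE_STUFF) != 0)
        return false; /* Request refused: stay on the non-AMX path. */
    return true;
}

/* Stub modelling a kernel that grants the request. */
static int stub_grant(int code, unsigned long arg)
{
    (void)code;
    (void)arg;
    return 0;
}

/* Stub modelling a kernel that refuses the request. */
static int stub_refuse(int code, unsigned long arg)
{
    (void)code;
    (void)arg;
    errno = ENOMEM;
    return -1;
}
```

A real caller would pass a thin wrapper around syscall(SYS_arch_prctl, ...) instead of the stubs, and pick its non-AMX code path whenever the function returns false.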

I think several things here are regrettable:

1. Legacy XSTATE code might assume that XCR0 is a constant.

2. Intel virt really doesn’t like us context switching XCR0, although we might say that this is Intel’s fault and therefore Intel’s problem. AMD hardware doesn’t appear to have this issue.

3. AMX being tangled up in XSTATE is unfortunate. The whole XSTATE mechanism is less than amazing.
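
To illustrate point 1, here is a sketch of the kind of legacy pattern that would break if XCR0 could change at runtime. All names are hypothetical, and the XCR0 reader is passed in as a function pointer so the example does not depend on executing XGETBV; the "later change" is simulated with a plain variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*xcr0_reader)(void);

/* Legacy-style helper: reads XCR0 once and trusts that value forever. */
static uint64_t legacy_enabled_features(xcr0_reader read_xcr0)
{
    static bool cached;
    static uint64_t cached_xcr0;

    assert(read_xcr0 != NULL);
    if (!cached) {
        cached_xcr0 = read_xcr0();
        cached = true;
    }
    return cached_xcr0;
}

/* Simulated XCR0, starting with x87 | SSE | AVX (bits 0..2) enabled. */
static uint64_t simulated_xcr0 = 0x7;

static uint64_t read_simulated_xcr0(void)
{
    return simulated_xcr0;
}

/* Returns true when the cached view no longer matches the simulated XCR0. */
static bool legacy_view_is_stale(void)
{
    uint64_t before = legacy_enabled_features(read_simulated_xcr0);

    /* Simulate the AMX bits (17 and 18) being turned on after startup. */
    simulated_xcr0 |= (UINT64_C(1) << 17) | (UINT64_C(1) << 18);

    uint64_t after = legacy_enabled_features(read_simulated_xcr0);
    return before == after && after != simulated_xcr0;
}
```

Code written this way would keep sizing buffers and choosing code paths from the startup value, which is the compatibility risk with a dynamic XCR0.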

IMO the best we can make of this whole situation is to make XCR0 dynamic, but the legacy compatibility issues are potentially problematic.

>
> Thanks,
> Florian
>
>