Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
From: Florian Weimer
Date: Mon Mar 29 2021 - 12:49:02 EST
* Len Brown via Libc-alpha:
>> In particular, the library may use instructions that main() doesn't know exist.
>
> And so I'll ask my question another way.
>
> How is it okay to change the value of XCR0 during the run time of a
> program?
>
> I submit that it is not, and that is a deal-killer for a
> request/release API.
>
> eg. main() doesn't know that the math library wants to use AMX, and
> neither does the threading library. So main() doesn't know to call
> the API before either library is invoked. The threading library
> starts up and creates user-space threads based on the initial value
> from XCR0. Then the math library calls the API, which adds bits to
> XCR0, and then the user-space context switch in the threading
> library corrupts data because the new XCR0 size doesn't match the
> initial size.

I agree that this doesn't quite work. (Today, it's not the thread
library, but the glibc dynamic loader trampoline.)

I disagree that CPU feature enablement has been a failure. I think we
are pretty good at enabling new CPU features on older operating
systems, not just bleeding edge mainline kernels. Part of that is
that anything but the kernel stays out of the way, and most features
are available directly via inline assembly (you can even use .byte
hacks if you want). There is no need to switch to new userspace
libraries, compile out-of-tree kernel drivers that have specific
firmware requirements, and so on.

If the operations that need a huge context can be made idempotent,
with periodic checkpoints, it might be possible to avoid saving the
context completely by some rseq-like construct.