Re: [RFC PATCH 13/22] x86/fpu/xstate: Expand dynamic user state area on first use

From: Dave Hansen
Date: Wed Oct 14 2020 - 12:29:26 EST


On 10/14/20 9:10 AM, Andy Lutomirski wrote:
>> Actually, I think the modified optimization would survive such a scheme:
>>
>> * copy page array into percpu area
>> * XRSTORS from percpu area, modified optimization tuple is saved
>> * run userspace
>> * XSAVES back to percpu area. tuple matches, modified optimization
>> is still in play
>> * copy percpu area back to page array
>>
>> Since both halves of the XRSTORS->XSAVES pair operate on the percpu
>> area, the XSAVE tracking hardware never knows it isn't working on the
>> "canonical" buffer (the page array).
> I was suggesting something a little bit different. We'd keep XMM,
> YMM, ZMM, etc state stored exactly the way we do now and, for
> AMX-using tasks, we would save the AMX state in an entirely separate
> buffer. This way the pain of having a variable xstate layout is
> confined just to AMX tasks.

OK, got it.

So, we'd either need a second set of XSAVE/XRSTORs, or "manual" copying
of the registers out to memory. We can preserve the modified
optimization if we're careful about ordering, but only for *ONE* of the
XSAVE buffers (if we use two).
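
To illustrate the ordering point, here is a toy userspace model, not
kernel code. It assumes the CPU remembers only the area used by the most
recent XRSTORS, and that XSAVES may skip unmodified components only when
it targets that same area. All names (toy_cpu, toy_xrstors, toy_xsaves,
TOY_FP, TOY_AMX) are made up for the sketch, and the real tracking rules
in the SDM are more involved (CPL, XCOMP_BV, init optimization).

```c
#include <stdint.h>
#include <stddef.h>

#define TOY_FP	(1ULL << 0)	/* stand-in for XMM/YMM/ZMM components */
#define TOY_AMX	(1ULL << 1)	/* stand-in for the AMX tile data component */

/* Hypothetical model of the tracking state; one record per CPU. */
struct toy_cpu {
	const void *tracked_area;	/* area used by the most recent XRSTORS */
	uint64_t modified;		/* components written since they were loaded */
};

/* Load the components in rfbm from area and start tracking that area. */
static void toy_xrstors(struct toy_cpu *cpu, const void *area, uint64_t rfbm)
{
	cpu->tracked_area = area;
	cpu->modified &= ~rfbm;
}

/* Userspace dirties some register state. */
static void toy_touch(struct toy_cpu *cpu, uint64_t mask)
{
	cpu->modified |= mask;
}

/* Return the mask of components that would actually be written out. */
static uint64_t toy_xsaves(const struct toy_cpu *cpu, const void *area,
			   uint64_t rfbm)
{
	if (area == cpu->tracked_area)
		return rfbm & cpu->modified;	/* modified optimization applies */
	return rfbm;				/* mismatch: write everything */
}

/*
 * Two-buffer scheme: FP state in one buffer, AMX state in another.
 * The AMX buffer is restored last, so it is the one being tracked.
 */
void toy_two_buffer_demo(uint64_t touched, uint64_t *fp_written,
			 uint64_t *amx_written)
{
	static char fp_buf[64], amx_buf[64];	/* only the addresses matter */
	struct toy_cpu cpu = { NULL, ~0ULL };	/* everything starts dirty */

	toy_xrstors(&cpu, fp_buf, TOY_FP);
	toy_xrstors(&cpu, amx_buf, TOY_AMX);	/* replaces the tracked area */
	toy_touch(&cpu, touched);		/* "run userspace" */

	*fp_written = toy_xsaves(&cpu, fp_buf, TOY_FP);
	*amx_written = toy_xsaves(&cpu, amx_buf, TOY_AMX);
}
```

In this model the FP buffer is always written in full, even when
userspace touched nothing, while the AMX buffer is written only when AMX
state was modified. Swapping the order of the two toy_xrstors() calls
moves the benefit to the FP buffer instead.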

> I'm okay with vmalloc() too, but I do think we need to deal with the
> various corner cases like allocation failing.

Yeah, agreed about handling the corner cases. Also, if we preserve
plain old vmalloc() for now, we need good tracepoints or stats so we can
precisely figure out how many vmalloc()s (and IPIs) are due to AMX.
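
As a rough sketch of what the failure handling plus stats could look
like, here is a userspace mock-up. It is not a proposed implementation:
malloc() stands in for vmalloc(), and toy_task, toy_amx_stats and
toy_expand_xstate() are hypothetical names. IPIs are not modeled at all.
The allocator is passed in only so the failure path can be exercised.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counters for expansions attributed to AMX first use. */
struct toy_amx_stats {
	atomic_ulong expand_ok;
	atomic_ulong expand_failed;
	atomic_ulong bytes_allocated;
};

struct toy_amx_stats toy_stats;

/* Hypothetical per-task state: just the buffer and its size. */
struct toy_task {
	void *xstate;
	size_t xstate_size;
};

/* Allocator that always fails, for exercising the error path. */
void *toy_alloc_fail(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Grow the task's xstate buffer to new_size bytes.
 * alloc must return memory that can be released with free(), or NULL.
 * On failure the old buffer is left untouched and -ENOMEM is returned.
 */
int toy_expand_xstate(struct toy_task *t, size_t new_size,
		      void *(*alloc)(size_t))
{
	void *buf;

	if (new_size <= t->xstate_size)
		return 0;		/* already large enough */

	buf = alloc(new_size);
	if (!buf) {
		atomic_fetch_add(&toy_stats.expand_failed, 1);
		return -ENOMEM;		/* caller decides what to do */
	}

	memset(buf, 0, new_size);
	if (t->xstate)
		memcpy(buf, t->xstate, t->xstate_size);
	free(t->xstate);

	t->xstate = buf;
	t->xstate_size = new_size;
	atomic_fetch_add(&toy_stats.expand_ok, 1);
	atomic_fetch_add(&toy_stats.bytes_allocated, new_size);
	return 0;
}
```

The point is only that a failed expansion leaves the task with its old,
still valid buffer, and that both outcomes are counted so the cost of
AMX can be measured separately from other allocations.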