Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

From: Len Brown
Date: Mon Mar 29 2021 - 11:44:53 EST


On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> > I found the author of this passage, and he agreed to revise it to say this
> > was targeted primarily at VMMs.
>
> Why would this only be a problem for VMMs?

VMMs may have to emulate different hardware for different guest OSes,
and they would likely "context switch" XCR0 to achieve that.

As switching XCR0 at run-time would confuse the heck out of user-space,
it was not imagined that a bare-metal OS would do that.

But yes, if a bare-metal OS doesn't support any threading libraries
that query XCR0 with xgetbv, and it doesn't care about the performance
impact of switching XCR0, it could choose to switch XCR0 and
would want to TILERELEASE to assure C6 access, if C6 is enabled.
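
For reference, this is roughly what "query XCR0 with xgetbv" looks like
from user-space. It is only a sketch, not code from any particular
library: the helper names read_xcr0() and xcr0_has_amx_tiles() are made
up for illustration, and it assumes GCC or clang (for <cpuid.h> and
inline asm). The bit positions (17 for XTILECFG, 18 for XTILEDATA) and
the CPUID.1:ECX.OSXSAVE check are from the SDM.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* XCR0 feature bits for the two AMX state components. */
#define XCR0_XTILECFG  (1ULL << 17)
#define XCR0_XTILEDATA (1ULL << 18)

/* Hypothetical helper: returns 1 only if both AMX tile components
 * are enabled in the given XCR0 value, 0 otherwise. */
static int xcr0_has_amx_tiles(uint64_t xcr0)
{
    const uint64_t mask = XCR0_XTILECFG | XCR0_XTILEDATA;
    return (xcr0 & mask) == mask;
}

/* Hypothetical helper: reads XCR0 into *out.
 * Returns 1 on success, 0 if xgetbv cannot be used
 * (OSXSAVE not set, or not an x86 build). */
static int read_xcr0(uint64_t *out)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if (!(ecx & (1u << 27)))        /* CPUID.1:ECX.OSXSAVE */
        return 0;

    /* xgetbv with ECX=0 returns XCR0 in EDX:EAX. */
    __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
    *out = ((uint64_t)edx << 32) | eax;
    return 1;
#else
    (void)out;
    return 0;
#endif
}
```

A library that caches the result of a check like this at startup is
exactly what would get confused if the OS changed XCR0 underneath it
at run-time.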

> > "negative power and performance implications" refers to the fact that
> > the processor will not enter C6 when AMX INIT=0, instead it will demote
> > to the next shallower C-state, e.g. C1E.
> >
> > (this is because the C6 flow doesn't save the AMX registers)
> >
> > For customers that have C6 enabled, the inability of a core to enter C6
> > may impact the maximum turbo frequency of other cores.
>
> That's the same on bare metal, right?

Yes, the hardware works exactly the same way.

thanks,
Len Brown, Intel Open Source Technology Center