Re: [PATCH] x86-64: use 32-bit XOR to zero registers

From: Ingo Molnar
Date: Thu Jul 26 2018 - 15:06:40 EST



* Pavel Machek <pavel@xxxxxx> wrote:

> On Thu 2018-07-26 13:45:37, Ingo Molnar wrote:
> >
> > * Pavel Machek <pavel@xxxxxx> wrote:
> >
> > > On Tue 2018-06-26 08:38:22, Henrique de Moraes Holschuh wrote:
> > > > On Tue, 26 Jun 2018, Jan Beulich wrote:
> > > > > >>> On 25.06.18 at 18:33, <rdunlap@xxxxxxxxxxxxx> wrote:
> > > > > > On 06/25/2018 03:25 AM, Jan Beulich wrote:
> > > > > >> Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms - use
> > > > > >> 32-bit ones instead.
> > > > > >
> > > > > > Hmph. Is that considered a bug (errata)?
> > > > >
> > > > > No.
> > > > >
> > > > > > URL/references?
> > > > >
> > > > > Intel's Optimization Reference Manual says so (in rev 040 this is in section
> > > > > 16.2.2.5 "Zeroing Idioms" as a subsection of the Goldmont/Silvermont
> > > > > descriptions).
> > > > >
> > > > > > Are these changes really only zeroing the lower 32 bits of the register?
> > > > > > and that's all that the code cares about?
> > > > >
> > > > > No - like all operations targeting a 32-bit register, the result is zero
> > > > > extended to the entire 64-bit destination register.
> > > >
> > > > Missing information that would have been helpful in the commit message:
> > > >
> > > > When the processor can recognize something as a zeroing idiom, it
> > > > optimizes that operation on the front-end. Only 32-bit XOR r,r is
> > > > documented as a zeroing idiom according to the Intel optimization
> > > > manual. While a few Intel processors recognize the 64-bit version of
> > > > XOR r,r as a zeroing idiom, many won't.
> > > >
> > > > Note that the 32-bit operation extends to the high part of the 64-bit
> > > > register, so it will zero the entire 64-bit register. The 32-bit
> > > > instruction is also one byte shorter.
> > >
> > > Actually, I believe that should be a comment in the code.
> >
> > Agreed - mind sending a patch that adds it?
>
> Ok. Would /* write to low 32 bits clears high 32 bits, too */ be
> a reasonable comment?

So I'd suggest putting the full description quoted above somewhere strategic,
such as at the top of entry_64.S or calling.h.
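
To illustrate the zero-extension point from the description above, here is a
rough sketch in C. It is not part of the patch: the helper name
zero_via_xor32() is hypothetical, the inline asm assumes GCC or Clang on
x86-64 with the default AT&T syntax, and the #else branch is only a portable
stand-in that models the same zero-extending 32-bit write.

```c
#include <stdint.h>

/*
 * Zero a 64-bit value with a 32-bit XOR.
 *
 * Like all operations targeting a 32-bit register, the result is
 * zero-extended to the entire 64-bit destination register, so the
 * upper 32 bits end up cleared as well.
 */
static uint64_t zero_via_xor32(uint64_t v)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	/* %k0 is the 32-bit name of operand 0 (e.g. %eax for %rax). */
	__asm__("xorl %k0, %k0" : "+r"(v) : : "cc");
#else
	/* Assumption: emulate the 32-bit XOR and its zero extension. */
	v = (uint64_t)((uint32_t)v ^ (uint32_t)v);
#endif
	return v;
}
```

Calling zero_via_xor32() on a value with bits set in the upper half should
still return 0, which is the property a comment in entry_64.S or calling.h
would be documenting.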

Thanks,

Ingo