Re: [tip: x86/boot] x86/boot: Use 32-bit XOR to clear registers

From: Ard Biesheuvel
Date: Fri Mar 01 2024 - 11:36:04 EST


On Fri, 1 Mar 2024 at 17:02, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 01/03/2024 1:10 pm, Ard Biesheuvel wrote:
> > On Fri, 1 Mar 2024 at 13:51, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >> On Fri, Mar 1, 2024 at 1:45 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >>> On Fri, 1 Mar 2024 at 13:39, tip-bot2 for Uros Bizjak
> >>> <tip-bot2@xxxxxxxxxxxxx> wrote:
> >>>> The following commit has been merged into the x86/boot branch of tip:
> >>>>
> >>>> Commit-ID: 721f791ce1cddfa5f2bf524ac14741bfa0f72697
> >>>> Gitweb: https://git.kernel.org/tip/721f791ce1cddfa5f2bf524ac14741bfa0f72697
> >>>> Author: Uros Bizjak <ubizjak@xxxxxxxxx>
> >>>> AuthorDate: Wed, 24 Jan 2024 11:38:59 +01:00
> >>>> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> >>>> CommitterDate: Fri, 01 Mar 2024 12:47:37 +01:00
> >>>>
> >>>> x86/boot: Use 32-bit XOR to clear registers
> >>>>
> >>>> x86_64 zero extends 32-bit operations, so for 64-bit operands,
> >>>> XORL r32,r32 is functionally equal to XORQ r64,r64, but avoids
> >>>> a REX prefix byte when legacy registers are used.
> >>>>
> >>> ... and so this change is pointless churn when not using legacy
> >>> registers, right?
> >> Although there is no code size change with REX registers, it would
> >> look weird to use XORQ with REX registers and XORL with legacy regs.
> > You are changing an isolated occurrence of XORQ into XORL on the basis
> > that XORQ 'looks weird', and would produce a longer opcode if the
> > occurrence in question would be using a different register than it
> > actually uses.
> >
> > Apologies for the bluntness, but in my book, this really falls firmly
> > into the 'pointless churn' territory. The startup code is not
> > performance critical, neither in terms of size nor in speed, and so
> > I'd prefer to avoid these kinds of changes. Just my 2c, though - Ingo
> > has already merged the patch.
>
> Without trying to get into an argument here...
>

No worries, we're all friends here :-)

> The better reason is that Silvermont Atoms don't recognise the 64bit
> form as a zeroing idiom. They only recognise the 32bit form of the idiom.
>
> Therefore in fastpaths it is (marginally) important to xorl %r15d, %r15d
> rather than xorq %r15, %r15.
>
> But this instance is not a fastpath, and it also doesn't save any
> encoding space, so I'm not sure it was really worth changing.
>

Yeah, that is my point, really. But let's move on. And apologies to
Uros for the tone - it didn't sound as grumpy in my head when I typed
it as it does now reading back the thread.