Re: [PATCH v2] x86-64: use 32-bit XOR to zero registers

From: Ingo Molnar
Date: Thu Jul 05 2018 - 03:12:24 EST

* Pavel Machek <pavel@xxxxxx> wrote:

> On Mon 2018-07-02 04:31:54, Jan Beulich wrote:
> > Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms. Zeroing
> > idioms don't require execution bandwidth, as they're being taken care
> > of in the frontend (through register renaming). Use 32-bit XORs instead.
> >
> > Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> > @@ -702,7 +702,7 @@ _no_extra_mask_1_\@:
> >
> > # GHASH computation for the last <16 Byte block
> > GHASH_MUL \AAD_HASH, %xmm13, %xmm0, %xmm10, %xmm11, %xmm5, %xmm6
> > - xor %rax,%rax
> > + xor %eax, %eax
> >
> > mov %rax, PBlockLen(%arg2)
> > jmp _dec_done_\@
> This is rather subtle... and looks like a bug. To zero 64-bit
> register, you zero its lower half, relying on implicit zeroing of the
> upper half. Wow.
> Perhaps we should get comments in the code? Because the explicit code
> is more readable...

The automatic zero-extension of 32-bit ops to the full 64-bit register is a basic,
fundamental and well-known x86-64 idiom in use in literally hundreds of places in
x86-64 assembly code.
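
To make the zero-extension rule concrete, here is a minimal sketch. It is not part of the patch: the helper names zero_via_32bit_xor() and mov_low_32() are hypothetical, and it assumes GCC or Clang extended asm, where the %k modifier selects the 32-bit name of a register. On non-x86-64 builds it only models the same semantics in plain C.

```c
#include <stdint.h>

/*
 * Hypothetical helper: zero a 64-bit value using a 32-bit XOR on the
 * register that holds it. The 32-bit write clears bits 63:32 as well.
 */
static uint64_t zero_via_32bit_xor(uint64_t v)
{
#if defined(__x86_64__)
	/* %k0 is the 32-bit name of operand 0, e.g. %eax for %rax. */
	__asm__ ("xorl %k0, %k0" : "+r" (v) : : "cc");
#else
	/* Portable model: 32-bit result, zero-extended to 64 bits. */
	v = (uint32_t)((uint32_t)v ^ (uint32_t)v);
#endif
	return v;
}

/*
 * Hypothetical helper: a 32-bit MOV into a register whose upper half
 * holds stale data. Only the low 32 bits of src survive; the upper
 * half of dst is zeroed, not preserved.
 */
static uint64_t mov_low_32(uint64_t dst, uint64_t src)
{
#if defined(__x86_64__)
	__asm__ ("movl %k1, %k0" : "+r" (dst) : "r" (src));
#else
	/* Portable model of the same zero-extending 32-bit write. */
	dst = (uint32_t)src;
#endif
	return dst;
}
```

Calling zero_via_32bit_xor() on any input should return 0, and mov_low_32(~0ULL, 0xdeadbeefcafef00d) should return 0x00000000cafef00d, which is why the following 64-bit store of %rax in the patch still writes a full zero.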

We sometimes document zero-extension on entry boundaries where we want to make it
really clear what information gets (and what doesn't get) into the kernel, but
generally it only needs documentation in the (very rare) cases where it's *not*

Also, why would it be a bug?