Re: [PATCH 1/2] KVM: arm64: Fix sign-extension of MMIO loads

From: Fuad Tabba

Date: Tue Jun 23 2026 - 09:54:46 EST


On Tue, 23 Jun 2026 at 14:08, Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Mon, 22 Jun 2026 20:07:00 +0100,
> Fuad Tabba <fuad.tabba@xxxxxxxxx> wrote:
> >
> > A sign-extending load from MMIO (LDRSB, LDRSH, LDRSW) delivers a
> > zero-extended value: in kvm_handle_mmio_return(), vcpu_data_host_to_guest()
> > masks the data to the access width after sign-extension, stripping the
> > sign bits.
> >
> > Move vcpu_data_host_to_guest() ahead of sign-extension so the width mask
> > runs first. trace_kvm_mmio() moves with it and keeps reporting the raw
> > access-width data.
> >
> > Fixes: b30070862edbd ("ARM64: KVM: MMIO support BE host running LE code")
> > Signed-off-by: Fuad Tabba <fuad.tabba@xxxxxxxxx>
> > ---
> > arch/arm64/kvm/mmio.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
> > index e2285ed8c91de..d1c3a352d5a22 100644
> > --- a/arch/arm64/kvm/mmio.c
> > +++ b/arch/arm64/kvm/mmio.c
> > @@ -126,6 +126,10 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
> > len = kvm_vcpu_dabt_get_as(vcpu);
> > data = kvm_mmio_read_buf(run->mmio.data, len);
> >
> > + trace_kvm_mmio(KVM_TRACE_MMIO_READ, len, run->mmio.phys_addr,
> > + &data);
> > + data = vcpu_data_host_to_guest(vcpu, data, len);
>
> This helper is in charge of the endianness of the read from the
> guest's PoV. How can the sign-extension (which happens from the host's
> perspective) correctly work if it takes place after the byte swapping?
>
> To me, the real issue appears to be in the that helper, which swaps
> the data based on the size of the access instead of the width of the
> register.
>
> Or am I once more completely confused with the endianness crap?

Endianess always confuses me too :D

My reading of the ARM ARM is that the byte reversal is keyed on the access
size there too. It lives in Mem{size}, with the register width handled
separately by SignExtend(regsize):

data = Mem[address, 2]; // byte-reversed by the access size, BE
X[t] = SignExtend(data, regsize);

So vcpu_data_host_to_guest(..., len) swapping by len matches the Mem-side
reversal. Swapping by the register width would reorder bytes that were never
loaded. An LDRSH into Wt reads 2 bytes but would bswap 4: the halfword
reaches the helper as 0x0180 host-native, cpu_to_be32 turns it into
0x80010000 instead of the 0x8001 cpu_to_be16 gives, and it never sign-extends
to 0xffff8001.

If that reading holds, none of the helper's ops are individually wrong, and
the only bug was the order, with the sign-extend running before the swap and
the width mask then dropping it. But I've gone round in circles on endianness
before (to say the least), so please say if I've done it again.

Cheers,
/fuad

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.