Re: [PATCH 0/2] KVM: arm64: Fix host/hyp tracking on share/unshare hypercall failure
From: Fuad Tabba
Date: Fri May 29 2026 - 04:22:55 EST
On Fri, 29 May 2026 at 09:15, Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Fri, 29 May 2026 09:05:35 +0100,
> Fuad Tabba <tabba@xxxxxxxxxx> wrote:
> >
> > On Fri, 29 May 2026 at 09:02, Vincent Donnefort <vdonnefort@xxxxxxxxxx> wrote:
> > >
> > > On Fri, May 29, 2026 at 08:43:39AM +0100, tabba@xxxxxxxxxx wrote:
> > > > Hi folks,
> > > >
> > > > Yet another bug I found while testing Sashiko locally with fixes to
> > > > review-prompts.
> > > >
> > > > share_pfn_hyp() and unshare_pfn_hyp() in arch/arm64/kvm/mmu.c
> > > > maintain a host-side RB-tree mirroring the set of pages shared with
> > > > EL2. Both invoke a hypercall that can fail (page-state mismatch,
> > > > EL2 refcount still held), but neither cleans up on failure:
> > > >
> > > > - share_pfn_hyp() inserts the tracking node before the hypercall
> > > > and leaves it in the tree on failure, leaking the allocation and
> > > > presenting a phantom share to a later unshare.
> > > >
> > > > - unshare_pfn_hyp() erases the tracking node before the hypercall;
> > > > on failure the host loses its record while EL2 still owns the
> > > > share, breaking later operations on the same pfn.
> > > >
> > > > Severity is low (no isolation impact) and the failure paths are rare
> > > > in practice, but the desync is real. Both patches are independent and
> > > > apply cleanly to current mainline. In other words, this can wait for
> > > > 7.2.
> > >
> > >
> > > I believe I fixed that here: lore.kernel.org/all/acyKhZL2di_QQ9xm@xxxxxxxxxx but,
> > > as Quentin pointed out, there's absolutely no reason for the hypercall to fail.
> > > So I haven't sent a v2.
> >
> > At the very least we need to add a comment, otherwise, people like me
> > and LLMs like Sashiko would stumble upon it.
> >
> > That said, this fix adds no real overhead, makes the code clearer, and
> > guards us against a future where that call might fail.
> > Self-documenting, in essence.
> >
> > WDYT?
>
> If a hypercall really cannot fail, why does it have a return value?
Good point. If we know it cannot fail, how about just making it return `void`?
That said, Vincent's exact words are `very much unlikely`, which is not
the same as cannot fail :)
https://lore.kernel.org/all/acyKhZL2di_QQ9xm@xxxxxxxxxx/
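
To make the ordering from the cover letter concrete, here is a rough
user-space model of it. This is only a sketch and not the code in
arch/arm64/kvm/mmu.c: struct tracker, fake_hypercall(), fail_next,
share_pfn() and unshare_pfn() are all hypothetical names, a flat array
stands in for the RB-tree, and refcounting and locking are left out.
The point is just that the host-side record is rolled back if the share
call fails, and is only dropped once the unshare call has succeeded.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 16

/* Hypothetical stand-in for the host-side tracking tree (flat array). */
struct tracker {
	unsigned long pfn[MAX_TRACKED];
	size_t count;
};

/* Hypothetical failure injection: the next fake hypercall fails once. */
static bool fail_next;

/* Hypothetical stand-in for the share/unshare hypercall. */
static int fake_hypercall(unsigned long pfn)
{
	(void)pfn;
	if (fail_next) {
		fail_next = false;
		return -1;
	}
	return 0;
}

static bool tracker_contains(const struct tracker *t, unsigned long pfn)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->pfn[i] == pfn)
			return true;
	return false;
}

static int tracker_insert(struct tracker *t, unsigned long pfn)
{
	if (t->count == MAX_TRACKED)
		return -1;
	t->pfn[t->count++] = pfn;
	return 0;
}

static void tracker_erase(struct tracker *t, unsigned long pfn)
{
	for (size_t i = 0; i < t->count; i++) {
		if (t->pfn[i] == pfn) {
			/* Order does not matter here: move the last entry down. */
			t->pfn[i] = t->pfn[--t->count];
			return;
		}
	}
}

/* Share: record the pfn first, then undo the record if the call fails. */
static int share_pfn(struct tracker *t, unsigned long pfn)
{
	int ret;

	if (tracker_contains(t, pfn))
		return 0;

	ret = tracker_insert(t, pfn);
	if (ret)
		return ret;

	ret = fake_hypercall(pfn);
	if (ret)
		tracker_erase(t, pfn);	/* no phantom share left behind */
	return ret;
}

/* Unshare: keep the record until the call has actually succeeded. */
static int unshare_pfn(struct tracker *t, unsigned long pfn)
{
	int ret;

	if (!tracker_contains(t, pfn))
		return -1;

	ret = fake_hypercall(pfn);
	if (!ret)
		tracker_erase(t, pfn);
	return ret;
}
```

With fail_next set before a call, a failed share_pfn() leaves the
tracker empty and a failed unshare_pfn() leaves the entry in place, so
the host-side view keeps matching what the callee still holds.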
/fuad
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.