Re: [PATCH 0/3] KVM: arm64: nv: Shadow ptdump fixes
From: Wei-Lin Chang
Date: Fri Jun 26 2026 - 05:38:23 EST
On Thu, Jun 25, 2026 at 10:54:40AM +0100, Marc Zyngier wrote:
> On Thu, 25 Jun 2026 08:47:04 +0100,
> Wei-Lin Chang <weilin.chang@xxxxxxx> wrote:
> >
> > I don't see a way out with this per-mmu file scheme. The core issue is
> > mmus have a different lifetime than the VM's debugfs directory, and
> > both's removal can happen in parallel, i.e. the VM debugfs directory
> > can be removed anytime we are in mmu notifier release freeing the mmus
> > and their shadow ptdump files.
>
> Why isn't that a problem with the existing S2 ptdump code?
For the existing code, in terms of the ptdump files:
- For the canonical s2, the ptdump files are only removed in
kvm_destroy_vm_debugfs().
- For nested s2, files are removed in two places: "manually" at
mmu<->context unbind time in get_s2_mmu_nested(), and in
kvm_destroy_vm_debugfs(). get_s2_mmu_nested() can only be called
before kvm_destroy_vm_debugfs(), of course.
So no dentry UAF in the current code.
Putting everything together, the situation is pretty complicated. The
canonical mmu, nested mmu, canonical mmu->pgt, nested mmu->pgt all have
different lifetimes, and kvm_destroy_vm_debugfs() can run in parallel
with mmu notifier release (as said above), which frees nested mmus and
canonical + nested pgts. Additionally, the .show() callback can be
called after mmu notifier release (Sashiko).
Now, having thought about this again, a per-mmu file can work, as long
as we defend against a freed mmu or pgt.
What I'm thinking now:
- Just have all ptdump files removed at kvm_destroy_vm_debugfs(), to
avoid accessing the dentry in mmu notifier release.
- Revert the effects of 204f7c018d76 ("KVM: arm64: ptdump: Make KVM
ptdump code s2 mmu aware"), because the mmu pointer stored in
i_private and the ptdump state are simply invalid after mmu notifier
release for nested.
- Only store the kvm pointer in inode->i_private or seq_file->private.
- For nested, check both mmu and mmu->pgt in .show() under the
mmu_lock. For canonical, only the mmu->pgt check is needed.
- Guard the freeing of nested_mmus and the setting of nested_mmus = NULL
with the mmu_lock, so that the nested mmus cannot disappear from under
.show() after it has checked that nested_mmus exists.
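
To make the last three points a bit more concrete, here is a rough
userspace model of the locking scheme. It is only a sketch, not kernel
code: a pthread mutex stands in for the mmu_lock, and every name
prefixed with model_ is made up for illustration. Selecting the mmu by
index (negative for canonical) and returning -ENOENT for a freed mmu or
pgt are also assumptions on my side, not something decided above.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real structures. */
struct model_pgt { int levels; };
struct model_mmu { struct model_pgt *pgt; };	/* pgt == NULL once freed */

struct model_kvm {
	pthread_mutex_t mmu_lock;	/* models the mmu_lock */
	struct model_mmu canonical;	/* lives as long as the VM */
	struct model_mmu *nested_mmus;	/* NULL after release */
	size_t nested_mmus_size;
};

/* Models mmu notifier release: free and clear everything under the lock. */
static void model_mmu_release(struct model_kvm *kvm)
{
	size_t i;

	pthread_mutex_lock(&kvm->mmu_lock);
	for (i = 0; i < kvm->nested_mmus_size; i++)
		free(kvm->nested_mmus[i].pgt);
	free(kvm->nested_mmus);
	kvm->nested_mmus = NULL;
	kvm->nested_mmus_size = 0;
	free(kvm->canonical.pgt);
	kvm->canonical.pgt = NULL;
	pthread_mutex_unlock(&kvm->mmu_lock);
}

static int model_kvm_init(struct model_kvm *kvm, size_t nr_nested)
{
	size_t i;

	pthread_mutex_init(&kvm->mmu_lock, NULL);
	kvm->nested_mmus_size = 0;
	kvm->canonical.pgt = calloc(1, sizeof(*kvm->canonical.pgt));
	kvm->nested_mmus = calloc(nr_nested, sizeof(*kvm->nested_mmus));
	if (!kvm->canonical.pgt || !kvm->nested_mmus)
		goto fail;
	kvm->canonical.pgt->levels = 4;
	kvm->nested_mmus_size = nr_nested;
	for (i = 0; i < nr_nested; i++) {
		kvm->nested_mmus[i].pgt = calloc(1, sizeof(struct model_pgt));
		if (!kvm->nested_mmus[i].pgt)
			goto fail;
		kvm->nested_mmus[i].pgt->levels = 3;
	}
	return 0;
fail:
	model_mmu_release(kvm);
	return -ENOMEM;
}

/*
 * Models .show(): only the kvm pointer is stored with the file, and the
 * mmu and its pgt are looked up and checked under the lock on every call.
 * idx < 0 selects the canonical mmu.
 */
static int model_ptdump_show(struct model_kvm *kvm, int idx,
			     char *buf, size_t len)
{
	struct model_mmu *mmu;
	int ret = 0;

	assert(buf && len);
	pthread_mutex_lock(&kvm->mmu_lock);
	if (idx < 0) {
		mmu = &kvm->canonical;	/* only the pgt check is needed */
	} else if (!kvm->nested_mmus ||
		   (size_t)idx >= kvm->nested_mmus_size) {
		ret = -ENOENT;		/* nested mmus already freed */
		goto out;
	} else {
		mmu = &kvm->nested_mmus[idx];
	}
	if (!mmu->pgt) {
		ret = -ENOENT;		/* page table already freed */
		goto out;
	}
	snprintf(buf, len, "levels=%d", mmu->pgt->levels);
out:
	pthread_mutex_unlock(&kvm->mmu_lock);
	return ret;
}
```

In this model, a model_ptdump_show() call that runs after
model_mmu_release() fails cleanly instead of touching freed memory,
while the file itself can stay around until the VM's debugfs directory
is torn down.
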
I *think* this solves all problems known at the moment...
Thanks,
Wei-Lin Chang
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.