Re: [PATCH v3] KVM: PPC: Book3S HV: Validate arch_compat against host compatibility mode
From: Amit Machhiwal
Date: Tue Jun 16 2026 - 06:56:26 EST

Hi Ritesh,

Thank you for the review and feedback. Please find my response below.

On 2026/06/16 03:17 PM, Ritesh Harjani wrote:
> Amit Machhiwal <amachhiw@xxxxxxxxxxxxx> writes:
>
> > On IBM POWER systems, newer processor generations can operate in
> > compatibility modes corresponding to earlier generations. This becomes
> > relevant for nested virtualization, where nested KVM guests may need to
> > run with a specific processor compatibility level.
> >
> > Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
> > logical partition (L1) booted in Power10 compatibility mode, the guest
> > fails to boot while setting 'arch_compat'. This happens because the CPU
> > class is derived from the hardware PVR (via mfspr()), which reflects the
> > physical processor generation (Power11), rather than the effective
> > compatibility mode (Power10).
> >
> > As a result, userspace may request a Power11 arch_compat for the L2
> > guest. However, the L1 partition, running in Power10 compatibility, has
> > only negotiated support up to Power10 with the Power Hypervisor (L0).
> > When H_GUEST_SET_STATE is invoked with a Power11 Logical PVR, the
> > hypervisor rejects the request, leading to a late guest boot failure:
> >
> > KVM-NESTEDv2: couldn't set guest wide elements
> > [..KVM reg dump..]
> >
> > This situation should be detected earlier and rejected by KVM. Without
> > proper validation, if userspace ignores the error, the guest may continue
> > to boot in Power11 raw mode on a Power10 compatibility host, which should
> > not be allowed.
> >
> > Introduce a validation mechanism that detects unsupported arch_compat
> > values early in the guest initialization path. When an unsupported
> > arch_compat is requested (e.g., Power11 on a Power10 compatibility mode
> > host), kvmppc_set_arch_compat() uses cpu_has_feature(CPU_FTR_P11_PVR) to
> > detect the mismatch and sets arch_compat to PVR_ARCH_INVALID. This
> > triggers kvmppc_sanity_check() to mark the vCPU as invalid by setting
> > vcpu->arch.sane to false. On the next vCPU run, kvmppc_vcpu_run_hv()
> > checks this flag and returns -EINVAL, preventing the guest from running
> > with an invalid processor compatibility configuration.
> >
> > With this, when a Power11 arch_compat is requested on a Power10
> > compatibility mode host, the guest fails early during boot with:
> >
> > error: kvm run failed Invalid argument
> >
> > This provides a much clearer failure mode compared to the previous
> > behavior where the guest could boot in Power11 raw mode (if userspace
> > ignored the error) or fail late during H_GUEST_SET_STATE.
> >
> > Suggested-by: Vaibhav Jain <vaibhav@xxxxxxxxxxxxx>
> > Reviewed-by: Vaibhav Jain <vaibhav@xxxxxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx # v6.13+
> > Signed-off-by: Amit Machhiwal <amachhiw@xxxxxxxxxxxxx>
> > ---
> > Changes in v3:
> > * Fixed null pointer dereference in kvmppc_sanity_check(): added check for
> > vcpu->arch.vcore before accessing arch_compat, as vcore is NULL for Book3S
> > PR and BookE guests (only Book3S HV uses vcore) [Reported by Sashiko AI]
> > * Added Reviewed-by tag from Vaibhav
> >
> > Changes in v2:
> > * Fixed issue where v1 allowed guest to boot in Power11 raw mode when
> > userspace ignored the error, by adding validation in kvmppc_sanity_check()
> > to ensure early failure during vCPU run [Found the issue after posting v1,
> > also reported by Gautam.]
>
> Would be nice if we could post the matrix test results which Gautam
> posted earlier with this v3. I guess you meant you already tested all of
> those - it would be nice if we could explicitly put that info in the changelog.

Regarding the test matrix: Both Anushree and I have tested all the
scenarios comprehensively. Anushree has shared her detailed test results
in her reply to the patch [1], covering:

- P11 guest on P11 host -> Works
- P10 guest on P11 host -> Works
- P11 guest on compat-P10 host -> Correctly fails with "Invalid argument"
- P10 guest on compat-P10 host -> Works

I have also verified all these scenarios with the same results.

If you'd prefer that I add the test matrix explicitly to the changelog
and send a new version, please let me know and I'll be happy to do so.

[1] https://lore.kernel.org/all/4e833bec-67ae-4b78-9a50-e1b9dec4029a@xxxxxxxxxxxxx/

>
> > * Introduced PVR_ARCH_INVALID constant for marking invalid arch_compat
> > * Dropped all Reviewed-by and Tested-by tags due to code changes; requesting
> > fresh reviews
> > * v1: https://lore.kernel.org/all/20260603141539.47620-1-amachhiw@xxxxxxxxxxxxx/
> >
> > Changes in v1:
> > * Moved this patch out of the v3 series [1] as discussed here [2]
> > * Addressed below review comments from Ritesh:
> > - Based the PVR validation on cpu features
> > - Fixed hcall name typo
> > - Stable backport
> >
> > [1] https://lore.kernel.org/all/20260522152744.55251-1-amachhiw@xxxxxxxxxxxxx/
> > [2] https://lore.kernel.org/all/20260522152744.55251-2-amachhiw@xxxxxxxxxxxxx/
> > ---
> > arch/powerpc/include/asm/reg.h | 1 +
> > arch/powerpc/kvm/book3s_hv.c | 15 ++++++++++++++-
> > arch/powerpc/kvm/powerpc.c | 4 ++++
> > 3 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> > index 3449dd2b577d..7472b9522f71 100644
> > --- a/arch/powerpc/include/asm/reg.h
> > +++ b/arch/powerpc/include/asm/reg.h
> > @@ -1356,6 +1356,7 @@
> > #define PVR_ARCH_300 0x0f000005
> > #define PVR_ARCH_31 0x0f000006
> > #define PVR_ARCH_31_P11 0x0f000007
> > +#define PVR_ARCH_INVALID 0xffffffff
>
> Logical processor version is defined as part of the PAPR spec. We should
> ensure that this invalid PVR is also documented in the PAPR spec.
>
> If you have already taken care of that, then please confirm and feel free to add:

Regarding the PAPR specification documentation: The PAPR spec documents
the valid Processor Version Register (PVR) values for each processor
generation (POWER8, POWER9, POWER10, POWER11, etc.). However, the
PVR_ARCH_INVALID value (0xffffffff) introduced in this patch is a
KVM implementation detail used internally to mark invalid compatibility
mode requests - it's not an architectural value that would be defined in
PAPR itself.

The validation logic and the use of PVR_ARCH_INVALID as a sentinel value
are documented in the kernel code and commit message.
Please let me know if this addresses your concern, or if you'd like me
to add specific documentation.
>
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx>

Thank you again for taking the time to review.

~Amit
>