[RFC PATCH v2 0/4] arm64: Add PSCI v1.3 SYSTEM_OFF2 support for hibernation

From: David Woodhouse
Date: Mon Mar 18 2024 - 12:48:26 EST


The PSCI v1.3 spec (https://developer.arm.com/documentation/den0022,
currently in Alpha state, hence 'RFC') adds support for a SYSTEM_OFF2
function enabling a HIBERNATE_OFF state which is analogous to ACPI S4.
This will allow hosting environments to determine that a guest is
hibernated rather than just powered off, and ensure that they preserve
the virtual environment appropriately to allow the guest to resume
safely (or bump the hardware_signature in the FACS to trigger a clean
reboot instead).

This adds support for it to KVM, exactly the same way as the existing
support for SYSTEM_RESET2 as added in commits d43583b890e7 ("KVM: arm64:
Expose PSCI SYSTEM_RESET2 call to the guest") and 34739fd95fab ("KVM:
arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags field").
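
As a rough illustration of what a VMM might do with this, here is a minimal userspace sketch. It assumes the hibernation request reaches the VMM as a shutdown-type system event carrying an extra flag, by analogy with the SYSTEM_RESET2 flag. Every name in it (the demo_* types, DEMO_SHUTDOWN_FLAG_PSCI_OFF2 and the numeric values) is a hypothetical stand-in, not the real uapi.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not the real KVM uapi names. */
enum demo_event_type {
	DEMO_EVENT_SHUTDOWN = 1,
	DEMO_EVENT_RESET    = 2,
};

/* Assumed flag: the shutdown was requested via SYSTEM_OFF2 (hibernate). */
#define DEMO_SHUTDOWN_FLAG_PSCI_OFF2	(UINT64_C(1) << 0)

struct demo_system_event {
	uint32_t type;
	uint64_t flags;
};

enum demo_vmm_action {
	DEMO_IGNORE,
	DEMO_DESTROY_VM,		/* plain power-off */
	DEMO_PRESERVE_FOR_RESUME,	/* keep the virtual environment intact */
	DEMO_RESET_VM,
};

/* Decide what the VMM does when a vCPU exits with a system event. */
static enum demo_vmm_action
demo_handle_system_event(const struct demo_system_event *ev)
{
	switch (ev->type) {
	case DEMO_EVENT_RESET:
		return DEMO_RESET_VM;
	case DEMO_EVENT_SHUTDOWN:
		/* Hibernate: the guest expects to resume on the same "hardware". */
		if (ev->flags & DEMO_SHUTDOWN_FLAG_PSCI_OFF2)
			return DEMO_PRESERVE_FOR_RESUME;
		return DEMO_DESTROY_VM;
	default:
		return DEMO_IGNORE;
	}
}
```

A VMM that cannot preserve the environment would take the other route mentioned above and bump the hardware_signature so the guest does a clean reboot instead of resuming.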

Back then, KVM was unconditionally bumped to expose PSCI v1.1. This
means that a kernel upgrade causes guest-visible behaviour changes
without any explicit opt-in from the VMM, which is... unconventional. In
some cases, a PSCI update isn't just about new optional calls; PSCI v1.2,
for example, adds a new permitted error return from the existing CPU_ON
function.

There *is* a way for a VMM to opt *out* of newer PSCI versions... by
setting a per-vCPU "special" register that actually ends up setting the
PSCI version KVM-wide. Quite why this isn't just a simple KVM_CAP, I
have no idea. There *is* a KVM_CAP_ARM_PSCI_0_2 but that's just for 0.1
vs. 0.2+, not the specific v0.2+ version that's exposed.

Since the SYSTEM_OFF2 call is optional and discoverable through the
PSCI_FEATURES call, I'm electing not to touch the PSCI versioning
awfulness at all. Like the existing SYSTEM_RESET2, there's a KVM_CAP to
enable it explicitly (as it's an optional call even in v1.3), and like
the existing SYSTEM_RESET2 it doesn't depend on the advertised PSCI
version.
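
To make the discovery point concrete, here is a small userspace model of a guest probing for the call through PSCI_FEATURES. It is only a sketch: demo_firmware and demo_psci_call() are hypothetical mocks standing in for the real SMC/HVC conduit, and the SYSTEM_OFF2 function ID is an assumption here, given that the spec is still in Alpha. The PSCI_VERSION and PSCI_FEATURES IDs and the NOT_SUPPORTED value are the usual SMC32 ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSCI_FN_PSCI_VERSION	0x84000000U
#define PSCI_FN_PSCI_FEATURES	0x8400000AU
#define DEMO_FN_SYSTEM_OFF2	0x84000015U	/* assumed ID for this sketch */
#define PSCI_RET_NOT_SUPPORTED	(-1)

/* PSCI_VERSION result: major in bits 31:16, minor in bits 15:0. */
#define DEMO_PSCI_VERSION(maj, min) \
	(((uint32_t)(maj) << 16) | (uint32_t)(min))

/* Hypothetical mock of the firmware/hypervisor side. */
struct demo_firmware {
	uint32_t version;	/* advertised PSCI version */
	bool off2_enabled;	/* did the VMM opt in to SYSTEM_OFF2? */
};

static int32_t demo_psci_call(const struct demo_firmware *fw,
			      uint32_t fn, uint32_t arg0)
{
	switch (fn) {
	case PSCI_FN_PSCI_VERSION:
		return (int32_t)fw->version;
	case PSCI_FN_PSCI_FEATURES:
		/* Note: the advertised version is deliberately not consulted. */
		if (arg0 == DEMO_FN_SYSTEM_OFF2)
			return fw->off2_enabled ? 0 : PSCI_RET_NOT_SUPPORTED;
		if (arg0 == PSCI_FN_PSCI_VERSION || arg0 == PSCI_FN_PSCI_FEATURES)
			return 0;
		return PSCI_RET_NOT_SUPPORTED;
	default:
		return PSCI_RET_NOT_SUPPORTED;
	}
}

/* Guest side: a non-negative PSCI_FEATURES result means "implemented". */
static bool demo_guest_has_system_off2(const struct demo_firmware *fw)
{
	return demo_psci_call(fw, PSCI_FN_PSCI_FEATURES,
			      DEMO_FN_SYSTEM_OFF2) >= 0;
}
```

In this model a guest that sees v1.1 advertised can still find SYSTEM_OFF2, and a guest whose VMM never enabled it simply gets NOT_SUPPORTED. The mock returns 0 on success and does not model any feature bits in the return value.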

For the guest side, add a new SYS_OFF_MODE_POWER_OFF handler with higher
priority than the EFI one, but which *only* triggers when there's a
hibernation in progress. There are other ways to do this (see the commit
message for more details) but this seemed like the simplest.
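
The handler ordering described above can be modelled in a few lines of userspace C. This is only a sketch of the idea, not the kernel's sys-off handler API: the demo_* names, the priority values and the chain walker are all made up for illustration. The point is that the higher-priority handler declines unless a hibernation is in progress, so an ordinary power-off still reaches the EFI handler.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_result { DEMO_PASS, DEMO_HANDLED };
enum demo_who { DEMO_BY_NOBODY, DEMO_BY_EFI, DEMO_BY_PSCI_OFF2 };

struct demo_state {
	bool hibernating;		/* is a hibernation in progress? */
	enum demo_who powered_off_by;
};

struct demo_handler {
	int priority;			/* higher runs first */
	enum demo_result (*fn)(struct demo_state *s);
};

/* Only acts during hibernation; otherwise lets the chain continue. */
static enum demo_result demo_off2_handler(struct demo_state *s)
{
	if (!s->hibernating)
		return DEMO_PASS;
	s->powered_off_by = DEMO_BY_PSCI_OFF2;
	return DEMO_HANDLED;
}

static enum demo_result demo_efi_handler(struct demo_state *s)
{
	s->powered_off_by = DEMO_BY_EFI;
	return DEMO_HANDLED;
}

/*
 * Call handlers in descending priority order until one handles the request.
 * Priorities are assumed to be distinct and below INT_MAX.
 */
static void demo_run_chain(const struct demo_handler *h, size_t n,
			   struct demo_state *s)
{
	int ceiling = INT_MAX;

	for (;;) {
		const struct demo_handler *next = NULL;

		for (size_t i = 0; i < n; i++)
			if (h[i].priority < ceiling &&
			    (!next || h[i].priority > next->priority))
				next = &h[i];
		if (!next || next->fn(s) == DEMO_HANDLED)
			return;
		ceiling = next->priority;
	}
}

static enum demo_who demo_power_off(bool hibernating)
{
	/* Registration order is deliberately not priority order. */
	static const struct demo_handler handlers[] = {
		{ .priority = 1, .fn = demo_efi_handler },
		{ .priority = 2, .fn = demo_off2_handler },
	};
	struct demo_state s = { hibernating, DEMO_BY_NOBODY };

	demo_run_chain(handlers, sizeof(handlers) / sizeof(handlers[0]), &s);
	return s.powered_off_by;
}
```

With this arrangement demo_power_off(false) ends up in the EFI handler and demo_power_off(true) in the SYSTEM_OFF2 one, which is the behaviour the new handler is meant to have.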

Version 2 of the patch series splits out the psci.h definitions into a
separate commit (a dependency for both the guest and KVM side), and adds
definitions for the other new functions added in v1.3. It also moves the
pKVM psci-relay support to a separate commit; although that code lives
in arch/arm64/kvm, it's actually about the *guest* side of SYSTEM_OFF2
(i.e. using it from the host kernel, relayed through nVHE).

David Woodhouse (4):
  firmware/psci: Add definitions for PSCI v1.3 specification (ALPHA)
  KVM: arm64: Add PSCI SYSTEM_OFF2 function for hibernation
  KVM: arm64: nvhe: Pass through PSCI v1.3 SYSTEM_OFF2 call
  arm64: Use SYSTEM_OFF2 PSCI call to power off for hibernate

 Documentation/virt/kvm/api.rst       | 11 +++++++++++
 arch/arm64/include/asm/kvm_host.h    |  2 ++
 arch/arm64/include/uapi/asm/kvm.h    |  6 ++++++
 arch/arm64/kvm/arm.c                 |  5 +++++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c |  2 ++
 arch/arm64/kvm/psci.c                | 37 ++++++++++++++++++++++++++++++++++++
 drivers/firmware/psci/psci.c         | 35 ++++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h             |  1 +
 include/uapi/linux/psci.h            | 20 +++++++++++++++++++
 kernel/power/hibernate.c             |  5 ++++-
 10 files changed, 123 insertions(+), 1 deletion(-)