Re: [PATCH v13 00/48] arm64: Support for Arm CCA in KVM
From: Suzuki K Poulose
Date: Thu Apr 16 2026 - 07:09:25 EST
On 16/04/2026 00:27, Alper Gun wrote:
On Wed, Apr 15, 2026 at 4:01 AM Steven Price <steven.price@xxxxxxx> wrote:
On 14/04/2026 22:40, Alper Gun wrote:
On Wed, Mar 18, 2026 at 8:54 AM Steven Price <steven.price@xxxxxxx> wrote:
This series adds support for running protected VMs using KVM under the
Arm Confidential Compute Architecture (CCA).
New major version number! This now targets RMM v2.0-bet0[1]. And unlike
for Linux this represents a significant change.
RMM v2.0 brings with it the ability to configure the RMM to have the
same page size as the host (so no more RMM_PAGE_SIZE and dealing with
granules being different from host pages). It also introduces range
based APIs for many operations which should be more efficient and
simplifies the code in places.
The handling of the GIC has changed: system registers are now used to
pass the GIC state rather than memory. This means fewer changes to the
KVM code as it looks much like a normal VM in this respect.
And of course the new uAPI introduced in the previous v12 posting is
retained so that also remains simplified compared to earlier postings.
The RMM support for v2.0 is still early and so this series includes a
few hacks to ease the integration. Of note are that there are some RMM
v1.0 SMCs added to paper over areas where the RMM implementation isn't
quite ready for v2.0, and "SROs" (see below) are deferred to the final
patch in the series.
The PMU in RMM v2.0 requires more handling on the RMM-side (and
therefore simplifies the implementation on Linux), but this isn't quite
ready yet. The Linux side is implemented (but untested).
PSCI still requires the VMM to provide the "target" REC for operations
that affect another vCPU. This is likely to change in a future version
of the specification. There's also a desire to force PSCI to be handled
in the VMM for realm guests - this isn't implemented yet as I'm waiting
for the dust to settle on the RMM interface first.
Stateful RMI Operations
-----------------------
The RMM v2.0 spec brings a new concept of Stateful RMI Operations (SROs)
which allow the RMM to complete an operation over several SMC calls,
requesting memory from (or returning memory to) the host along the way.
This has the benefit of
allowing interrupts to be handled in the middle of an operation (by
returning to the host to handle the interrupt without completing the
operation) and enables the RMM to dynamically allocate memory for
internal tracking purposes. One example of this is that RMI_REC_CREATE
no longer needs "auxiliary granules" provided upfront, but can instead
request the memory it needs during the RMI_REC_CREATE operation.
There are a fairly large number of operations defined as SROs in the
specification, but currently both Linux and the RMM only support
RMI_REC_CREATE and RMI_REC_DESTROY. There are a number of TODOs/FIXMEs
in the code where support is missing.
Given the early stage support for this, the SRO handling is all confined
to the final patch. This patch can be dropped to return to a pre-SRO
state (albeit a mixture of RMM v1.0 and v2.0 APIs) for testing purposes.
A future posting will reorder the series to move the generic SRO support
to an early patch and will implement the proper support for this in all
RMI SMCs.
One aspect of SROs which is not yet well captured is that in some
circumstances the Linux kernel will need to issue an SRO call in a
context where memory allocation is restricted (e.g. because a spinlock
is held). In this case the intention is that the SRO will be cancelled,
the spinlock dropped so the memory allocation can be completed, and then
the SRO restarted (obviously after rechecking the state that the
spinlock was protecting). For this reason the code stores the memory
allocations within a struct rmi_sro_state object - see the final patch
for more details.
This series is based on v7.0-rc1. It is also available as a git
repository:
https://gitlab.arm.com/linux-arm/linux-cca cca-host/v13
Hi Steven,
I have a question regarding host kexec and kdump scenarios, and
whether there is any plan to make them work in this initial series.
Intel TDX and AMD SEV-SNP both have a firmware shutdown command that
is invoked during the kexec or panic code paths to safely bypass
hardware memory protections and boot into the new kernel. As far as
I know, there is no similar global teardown command available for
the RMM.
Correct, the RMM specification as it stands doesn't provide a mechanism
for the host to do this. The host would have to identify all the realm
guests in the system: specifically the address of the RDs (Realm
Descriptors) and RECs (Realm Execution Contexts). It needs this to tear
down the guests and be able to undelegate the memory.
It's an interesting point and I'll raise the idea of a "firmware
shutdown command" to make this more feasible.
What is the roadmap for supporting both general kexec and
more specifically kdump (panic) scenarios with CCA?
I'm afraid I don't have a roadmap for these. kexec in theory would be
possible with KVM gracefully terminating all realms. For kdump/panic
that sort of graceful shutdown isn't really appropriate (or likely to
succeed).
Thanks Steven for the clarification.
For us, kdump is highly critical as it is our primary diagnostic tool
for host crashes. Without it, monitoring and debugging at fleet scale
would become unmanageable.
To confirm my understanding of the current architecture: if a host
panics while no Realms are actively running (and therefore no pages
are currently in the delegated state), the standard kdump extraction
should work perfectly fine without any modifications, correct?
This may not be true. We could have pages donated to the RMM for GPTs
(Granule Protection Tables), tracking structures, etc. So, unless Linux
keeps track of them, it may be unsafe for a crash kernel to access them.
Regarding the KVM tracking structures (RDs, RECs, RTTs, etc.) when VMs
are running, perhaps we could use `vmcoreinfo` to export the physical
addresses of these delegated pages. This would allow tools like
`makedumpfile` to explicitly filter them out. I assume these pages must
remain hardware-locked while the VMs are active.

Thinking of this, do we really need to? We could access the pages from a
"vmcore" read, handle the GPFs for such accesses, and give out 0s for
those granules. In any case, we can't get access to the data on pages
that are still in the Realm PAS.
Long-term, having an architectural shutdown command - similar to the
TDH.SYS.DISABLE command in Intel TDX - would be incredibly useful. It
would allow the kdump kernel to safely bypass these hardware security
checks, especially when extracting host-side KVM state.
For kexec, maybe we could do this. Alternatively, we could try to
reclaim everything back (GPTs, tracking structures) before the kexec
reboot.
As for the protected realm memory, I assume that is an easier problem.
We naturally want to exclude guest pages from a host dump regardless
of whether they are Realm pages or not. However, accidental touches
are still fatal.
There is also some RMM configuration which cannot be repeated (see
RMI_RMM_CONFIG_SET) - which implies that the kexec kernel must be
similar to the first kernel (i.e. same page size).
That is true, the page sizes must match. The RMM spec has been updated
to allow probing the state of the RMM to detect whether it can do the
CONFIG_SET.
Suzuki
Thanks,
Steve