Re: [PATCH v13 00/48] arm64: Support for Arm CCA in KVM

From: Alper Gun

Date: Thu Apr 16 2026 - 13:54:15 EST


On Thu, Apr 16, 2026 at 4:05 AM Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:
>
> On 16/04/2026 00:27, Alper Gun wrote:
> > On Wed, Apr 15, 2026 at 4:01 AM Steven Price <steven.price@xxxxxxx> wrote:
> >>
> >> On 14/04/2026 22:40, Alper Gun wrote:
> >>> On Wed, Mar 18, 2026 at 8:54 AM Steven Price <steven.price@xxxxxxx> wrote:
> >>>>
> >>>> This series adds support for running protected VMs using KVM under the
> >>>> Arm Confidential Compute Architecture (CCA).
> >>>>
> >>>> New major version number! This now targets RMM v2.0-bet0[1]. And unlike
> >>>> for Linux this represents a significant change.
> >>>>
> >>>> RMM v2.0 brings with it the ability to configure the RMM to have the
> >>>> same page size as the host (so no more RMM_PAGE_SIZE and dealing with
> >>>> granules being different from host pages). It also introduces range
> >>>> based APIs for many operations which should be more efficient and
> >>>> simplifies the code in places.
> >>>>
> >>>> The handling of the GIC has changed, so the system registers are used to
> >>>> pass the GIC state rather than memory. This means fewer changes to the
> >>>> KVM code as it looks much like a normal VM in this respect.
> >>>>
> >>>> And of course the new uAPI introduced in the previous v12 posting is
> >>>> retained so that also remains simplified compared to earlier postings.
> >>>>
> >>>> The RMM support for v2.0 is still early and so this series includes a
> >>>> few hacks to ease the integration. Of note are that there are some RMM
> >>>> v1.0 SMCs added to paper over areas where the RMM implementation isn't
> >>>> quite ready for v2.0, and "SROs" (see below) are deferred to the final
> >>>> patch in the series.
> >>>>
> >>>> The PMU in RMM v2.0 requires more handling on the RMM-side (and
> >>>> therefore simplifies the implementation on Linux), but this isn't quite
> >>>> ready yet. The Linux side is implemented (but untested).
> >>>>
> >>>> PSCI still requires the VMM to provide the "target" REC for operations
> >>>> that affect another vCPU. This is likely to change in a future version
> >>>> of the specification. There's also a desire to force PSCI to be handled
> >>>> in the VMM for realm guests - this isn't implemented yet as I'm waiting
> >>>> for the dust to settle on the RMM interface first.
> >>>>
> >>>> Stateful RMI Operations
> >>>> -----------------------
> >>>>
> >>>> The RMM v2.0 spec brings a new concept of Stateful RMI Operations (SROs)
> >>>> which allow the RMM to complete an operation over several SMC calls,
> >>>> requesting memory from and returning it to the host. This has the benefit of
> >>>> allowing interrupts to be handled in the middle of an operation (by
> >>>> returning to the host to handle the interrupt without completing the
> >>>> operation) and enables the RMM to dynamically allocate memory for
> >>>> internal tracking purposes. One example of this is that RMI_REC_CREATE
> >>>> no longer needs "auxiliary granules" provided upfront; it can instead
> >>>> request the memory it needs during the RMI_REC_CREATE operation.
> >>>>
> >>>> There are a fairly large number of operations that are defined as SROs
> >>>> in the specification, but currently both Linux and the RMM only have
> >>>> support for RMI_REC_CREATE and RMI_REC_DESTROY. There are a number of
> >>>> TODOs/FIXMEs in the code where support is missing.
> >>>> in the code where support is missing.
> >>>>
> >>>> Given the early stage support for this, the SRO handling is all confined
> >>>> to the final patch. This patch can be dropped to return to a pre-SRO
> >>>> state (albeit a mixture of RMM v1.0 and v2.0 APIs) for testing purposes.
> >>>>
> >>>> A future posting will reorder the series to move the generic SRO support
> >>>> to an early patch and will implement the proper support for this in all
> >>>> RMI SMCs.
> >>>>
> >>>> One aspect of SROs which is not yet well captured is that in some
> >>>> circumstances the Linux kernel will need to call an SRO call in a
> >>>> context where memory allocation is restricted (e.g. because a spinlock
> >>>> is held). In this case the intention is that the SRO will be cancelled,
> >>>> the spinlock dropped so the memory allocation can be completed, and then
> >>>> the SRO restarted (obviously after rechecking the state that the
> >>>> spinlock was protecting). For this reason the code stores the memory
> >>>> allocations within a struct rmi_sro_state object - see the final patch
> >>>> for more details.
> >>>>
> >>>> This series is based on v7.0-rc1. It is also available as a git
> >>>> repository:
> >>>>
> >>>> https://gitlab.arm.com/linux-arm/linux-cca cca-host/v13
> >>>>
> >>>>
> >>>
> >>> Hi Steven,
> >>>
> >>> I have a question regarding host kexec and kdump scenarios, and
> >>> whether there is any plan to make them work in this initial series.
> >>>
> >>> Intel TDX and AMD SEV-SNP both have a firmware shutdown command that
> >>> is invoked during the kexec or panic code paths to safely bypass
> >>> hardware memory protections and boot into the new kernel. As far as
> >>> I know, there is no similar global teardown command available for
> >>> the RMM.
> >>
> >> Correct, the RMM specification as it stands doesn't provide a mechanism
> >> for the host to do this. The host would have to identify all the realm
> >> guests in the system: specifically the address of the RDs (Realm
> >> Descriptors) and RECs (Realm Execution Contexts). It needs this to tear
> >> down the guests and be able to undelegate the memory.
> >>
> >> It's an interesting point and I'll raise the idea of a "firmware
> >> shutdown command" to make this more possible.
> >>
> >>> What is the roadmap for supporting both general kexec and
> >>> more specifically kdump (panic) scenarios with CCA?
> >>
> >> I don't have a roadmap I'm afraid for these. kexec in theory would be
> >> possible with KVM gracefully terminating all realms. For kdump/panic
> >> that sort of graceful shutdown isn't really appropriate (or likely to
> >> succeed).
> >>
> >
> > Thanks Steven for the clarification.
> >
> > For us, kdump is highly critical as it is our primary diagnostic tool
> > for host crashes. Without it, monitoring and debugging at fleet scale
> > would become unmanageable.
> >
> > To confirm my understanding of the current architecture: if a host
> > panics while no Realms are actively running (and therefore no pages
> > are currently in the delegated state), the standard kdump extraction
> > should work perfectly fine without any modifications, correct?
>
> This may not be true. We could have pages donated to the RMM for GPTs,
> tracking structures, etc. So unless Linux keeps track of them, it may
> be unsafe for a crash kernel to access them.
>
> >
> > Regarding the KVM tracking structures (RDs, RECs, RTTs, etc.) when VMs
> > are running, perhaps we could use `vmcoreinfo` to export the physical
> > addresses of these delegated pages. This would allow tools like
>
> Thinking about this, do we really need to? We could access the pages
> from the "vmcore" read path, handle the GPFs for such accesses, and
> give out zeros for those granules. In any case, we can't get access to
> the data on pages that are still in the Realm PAS.
>

I like the idea of handling the GPFs directly during vmcore reads for
the kdump case. That's a much simpler/cleaner solution.

> > `makedumpfile` to explicitly filter them out. I assume these pages must
> > remain hardware-locked while the VMs are active.
>
>
>
> >
> > Long-term, having an architectural shutdown command - similar to the
> > TDH.SYS.DISABLE command in Intel TDX - would be incredibly useful. It
> > would allow the kdump kernel to safely bypass these hardware security
> > checks, especially when extracting host-side KVM state.
>

> For kexec, maybe we could do this. Alternatively we could try to
> reclaim everything back (GPTs, tracking structures) before the kexec reboot.
>

Agreed. Reclaiming all delegated memory prior to the kexec reboot
makes perfect sense.

> >
> > As for the protected realm memory, I assume that is an easier problem.
> > We naturally want to exclude guest pages from a host dump regardless
> > of whether they are Realm pages or not. However, accidental touches
> > are still fatal.
> >
> >> There is also some RMM configuration which cannot be repeated (see
> >> RMI_RMM_CONFIG_SET) - which implies that the kexec kernel must be
> >> similar to the first kernel (i.e. same page size).
>
> That is true, the page sizes must match. The RMM spec has been updated
> so the host can probe the state of the RMM and detect whether it can
> still do the CONFIG_SET
>
> Suzuki
>
> >>
> >> Thanks,
> >> Steve
>