Re: [RFC v3 00/27] lib: Rust implementation of SPDM

From: dan.j.williams

Date: Thu Feb 19 2026 - 15:07:56 EST


Jason Gunthorpe wrote:
> On Thu, Feb 19, 2026 at 04:07:58PM +0100, Lukas Wunner wrote:
> > On Thu, Feb 19, 2026 at 10:31:29AM -0400, Jason Gunthorpe wrote:
> > > On Thu, Feb 19, 2026 at 03:15:34PM +0100, Lukas Wunner wrote:
> > > > The way this works in my series (and I presume Alistair's) is that
> > > > trusted root certificates for devices need to be added to the .cma
> > > > keyring.
> > > >
> > > > This can be done from user space using keyctl(1) or some other utility
> > > > that can talk to the kernel's existing keyctl ABI.
> > >
> > > I really don't like this from a verification perspective. We don't
> > > want the kernel checking signatures, that is the verifier's job.
> >
> > On resume from system sleep, the device is put into D0 already in the
> > ->resume_noirq() phase and drivers are free to access it already at
> > that point. However a verifier in user space cannot be queried
> > at that point because user space is still frozen.
> >
> > Likewise after recovery from DPC or AER, the device has been reset
> > and needs to be reauthenticated, yet user space may be unavailable
> > because the device that has been reset may contain the root partition
> > or may be the NIC that you need to query your remote attestation service.
> >
> > There is no way around some form of in-kernel device authentication
> > to accommodate such use cases.
>
> I'm arguing there are two very different steps here that must be kept
> separate. Verification is done when the device is first seen and the
> kernel is told it is OK to use the device.
>
> A same-device check is performed if an already verified and accepted
> device resumes or RAS's in some way.
>
> same-device does not mean run a kernel verification against the kernel
> keyring, as a second verification could be tricked into accepting
> something that has changed and defeat the userspace verifier.
>
> Instead the implementation should capture information when the device
> is accepted by the verifier and on resume/RAS it should compare the
> device against that captured information and determine if it is still
> the same device.
>
> The key north star must be that the userspace verifier alone decides
> if the device is acceptable and if the kernel is configured to
> auto-re-accept the device later on RAS/resume it must not permit a
> device that is different from what userspace approved. In other words
> it is not a verification on resume, it is just a kernel side
> confirmation the device hasn't been changed.
>
> Hence no keyring should be involved in resume.

I am also struggling to see a role for the .cma keyring as long as the
kernel eventually has a method to cache cert-chain, measurements, and,
for TDISP, interface report digests. Support for a recovery flow is not
the first dragon to slay, though, as just establishing device trust in
the first instance without RAS concerns is a significant amount of work.

Linux should not over-index on native bare-metal CMA because that
mechanism only tells you that the SPDM session with device firmware
(DSM) is authenticated; it does nothing to ensure that the kernel's view
of the device's MMIO is and remains associated with that DSM. Better
than nothing, yes, but it certainly assumes a less sophisticated threat
model than TDISP.

So the current 'authenticated' PCI sysfs attribute can simply indicate
"SPDM collateral (cert chain + measurements) available", and leave all
the decisions about what to do with that collateral to userspace. For
cases where the full lock + accept TDISP flow is not available, the only
policy knob that userspace has is to decline to attach a driver.

Once we have that, userspace can optionally tell the kernel to cache
digests for automatic re-accept / keep driver bound, or userspace can
plan to do another round trip with the verifier for recovery if the
device bounces out of the TCB.
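
To make the "cache digests, then confirm same-device" idea concrete, here
is a rough userspace sketch. It is an illustration only, not a proposed
kernel interface: the evidence files are hypothetical stand-ins for the
cert chain, measurements, and interface report that would be collected
over netlink, and the function names are made up for this example.

```shell
#!/bin/sh
# Hypothetical sketch: each argument is a file holding one piece of
# evidence (cert chain, measurements, interface report, ...).

# Print one digest covering all evidence files, in argument order.
# Each file is hashed on its own first so that file boundaries matter.
evidence_digest() {
    for f in "$@"; do
        sha256sum < "$f"
    done | sha256sum | cut -d' ' -f1
}

# Succeed only if the current evidence matches the digest that was
# cached when the verifier approved the device.
same_device() {
    cached=$1
    shift
    [ "$(evidence_digest "$@")" = "$cached" ]
}
```

The point of the sketch is that the recovery path is a comparison
against what the verifier already approved, not a fresh verification.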

My current thought is to take and adapt the netlink interface to
retrieve cert chains, change the certificate slot for the authentication
attempt, and retrieve device measurements. None of that requires the
x509 parser. With that in place, native CMA SPDM can be modeled as just
another TSM driver that only addresses a subset of the TDISP threat
model.

There are 2 flows depending on whether the TSM driver supports the
comprehensive security of the "lock+accept" model or not:

---

# $tsmN is a class device registered by one of amd-tio, intel-tdx,
# arm-cca, or, when this spdm library is available, a kernel-pci-cma TSM
# driver

# The "lock+accept" model is not available for any of the current tsm
# drivers on bare metal, only the "connect" flow. The connect flow gets
# you an SPDM secure session at a minimum and optionally IDE as well. It
# is a less comprehensive security model than TDISP
echo $tsmN > /sys/bus/pci/devices/$pci_dev/tsm/connect

# Alternatively, when a TDISP interface is assigned, the TSM driver
# publishes "lock+accept" attributes. This provides the comprehensive
# security model that closes the "DSM is and remains associated to
# device MMIO" TOCTOU problem.
echo $tsmN > /sys/bus/pci/devices/$pci_dev/tsm/lock

# In either of the above cases the
# /sys/bus/pci/devices/$pci_dev/authenticated attribute toggles to 1 and
# userspace is able to use PCI netlink to gather evidence with nonces
...collect (with netlink) / validate evidence...

# When the verifier is satisfied, bind the driver, or in the
# Confidential Computing / TDISP case, first "accept" the device so that
# it is allowed to access private memory
echo 1 > /sys/bus/pci/devices/$pci_dev/tsm/accept
echo $pci_dev > /sys/bus/pci/drivers/$driver/bind

---
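
For what it is worth, a userspace tool could pick between the two flows
above by checking which attributes the TSM driver published. A minimal
sketch, assuming attribute presence is the signal (that is my assumption
for illustration, not settled ABI; tsm_flow is a made-up helper name):

```shell
#!/bin/sh
# Sketch only: the attribute names come from the flow above; selecting
# a flow by attribute presence is an assumption, not a documented ABI.

# Print which flow a device offers: "lock", "connect", or "none".
# $1 is the device directory, e.g. /sys/bus/pci/devices/$pci_dev
tsm_flow() {
    if [ -e "$1/tsm/lock" ] && [ -e "$1/tsm/accept" ]; then
        echo lock
    elif [ -e "$1/tsm/connect" ]; then
        echo connect
    else
        echo none
    fi
}
```

Usage would be along the lines of
flow=$(tsm_flow /sys/bus/pci/devices/$pci_dev), followed by the matching
echo $tsmN step from the walkthrough.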