Re: [RFC v3 00/27] lib: Rust implementation of SPDM

From: dan.j.williams

Date: Thu Mar 05 2026 - 14:49:59 EST


Jason Gunthorpe wrote:
> On Wed, Mar 04, 2026 at 08:17:24PM -0800, dan.j.williams@xxxxxxxxx wrote:
>
> > So I ended up dropping this bit of the proposal because there is no need
> > for the kernel to be involved in any decision about the validity and
> > sufficiency of device evidence. Userspace has everything it needs to
> > make that initial determination. "Authenticated" simply means "evidence
> > ready".
>
> Right, this caching was only for the automatic device recovery flow.
>
> I think the kernel still probably needs to check the signed nonce
> against the device's public key as part of the SPDM handshake, but it
> doesn't need to do any validation of the public key.

Yes, and I think this is a subtle detail that I was failing to grok /
communicate previously. Of course the native SPDM implementation needs
to be able to perform a challenge response to establish the session. All
of the platform TSMs also internally carry out that protocol. The
difference with platform TSMs is that the kernel only sees the
resulting evidence.

So when / if the PCI/TSM evidence implementation grows kernel-internal
revalidate support it will be growing something like pdev->tsm->pubkey
that is installed / cached "after the fact" for platform TSMs. For the
native SPDM driver, pdev->tsm->pubkey can be installed at 'struct
pci_tsm_ops::connect()' time because the driver already has the key
parsed for its own internal purposes.
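
To make the timing difference concrete, here is a rough user-space model.
Every name in it (model_tsm, model_install_pubkey(), and so on) is
hypothetical and only stands in for the pdev->tsm->pubkey idea above; it
is a sketch, not the kernel's actual structures or a proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model only: these are not real kernel definitions. */
#define MODEL_PUBKEY_MAX 128

struct model_tsm {
	unsigned char pubkey[MODEL_PUBKEY_MAX];
	size_t pubkey_len;	/* 0 means "no key cached yet" */
};

/* Cache a device public key for a later revalidation challenge. */
static int model_install_pubkey(struct model_tsm *tsm,
				const unsigned char *key, size_t len)
{
	assert(tsm != NULL);
	if (!key || !len || len > MODEL_PUBKEY_MAX)
		return -1;
	memcpy(tsm->pubkey, key, len);
	tsm->pubkey_len = len;
	return 0;
}

/*
 * Native SPDM: the driver parsed the leaf key during its own handshake,
 * so the key can be cached as part of connect().
 */
static int model_native_connect(struct model_tsm *tsm,
				const unsigned char *leaf_key, size_t len)
{
	return model_install_pubkey(tsm, leaf_key, len);
}

/*
 * Platform TSM: connect() only yields evidence. The key is cached
 * "after the fact" by a separate model_install_pubkey() call.
 */
static int model_platform_connect(struct model_tsm *tsm)
{
	assert(tsm != NULL);
	tsm->pubkey_len = 0;
	return 0;
}
```

In this model the only difference between the two cases is who calls
model_install_pubkey() and when; the cached state is the same afterwards.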

> > Automatic device recovery into the TCB is a separate concern that needs
> > to be prepared to handle more than just "is this device able to generate
> > a fresh signature with the same cert chain that userspace saw before".
> > Yes, that is a minimal requirement but not sufficient for many cases.
> > For example cases that want to validate measurements, interface reports,
> > or opt-out of recovery because SPDM session loss is fatal.
>
> Yeah, but it is a pretty good starting point :)
>
> > Authenticate a device
> > =====================
> > Look in /sys/class/tsm to find a tsmN device which will be either an
> > instance associated with native kernel PCI CMA/SPDM or a platform tsm
> > like the one provided by AMD SEV-TIO, ARM CCA, Intel TDX, etc...
> >
> > echo tsmN > /sys/bus/pci/devices/$device/tsm/connect
> >
> > Once that succeeds the PCI/TSM evidence netlink interface is available
> > to dump any signatures created during that session establishment.
> >
> > After userspace is happy with that evidence it can bind a driver.
> >
> > If running in a confidential VM where the TSM driver is capable of
> > securing more than just an SPDM session the interface is:
> >
> > echo tsmN > /sys/bus/pci/devices/$device/tsm/lock
> >
> > Similar evidence can be collected, and when userspace is happy with it
> > it can accept the device:
> >
> > echo 1 > /sys/bus/pci/devices/$device/tsm/accept
> >
> > ...and bind a driver.
>
> Makes sense to me, userspace can figure out what flow to use.
>
> > Auto-recover device (future work)
> > =================================
> > By default, devices fall out of the TCB on recovery events for the TDISP
> > case and need userspace to rerun the lock and accept flow. For the
> > native SPDM CMA case the default is that the kernel continues to operate
> > the device post recovery and only userspace polling could detect device
> > replacement.
>
> Even with SPDM the kernel should know if the SPDM session has to be
> restarted right? It could squirt out a netlink multicast message, or a
> uevent on these events, so the polling is not baked into the
> architecture?

Right, "SPDM session lost" is a reasonable notification to put into the
interface and "future work" bucket. The polling comment was with respect
to the near-term limitation of this minimally viable first step.

> > To go beyond those defaults the kernel needs userspace to tell it how to
> > re-validate the device. I think that can be as simple as a netlink
> > message to store hashes of cert chains or measurements and then use
> > those in a new challenge / response with the device with a kernel
> > decided nonce.
>
> Yeah, I see several reasonable options:

+1 to these.

> 1) Above, kernel is informed of hashes and does an exact content
> compare
>
> 2) The driver operating the device has a built in same-device policy
> and deep parses the evidence for equivalence after the private key
> is validated, carefully avoiding volatile evidence bytes.

Yes, Lukas was also pointing out that the existing driver reset/resume
handlers could signal "revalidated" as well if the PCI layer had no
opinion.

...but the point is that it is a menu of options, not a static policy.
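
As a rough illustration of "menu, not static policy", here is a
user-space sketch that dispatches on a selected policy. It only models
options 1 and 2 above. All names (model_reval, model_revalidate(), the
placeholder driver check) are made up for the example and are not an
existing or proposed kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model only: these are not real kernel definitions. */
#define MODEL_DIGEST_LEN 32

enum model_reval_policy {
	MODEL_REVAL_HASH_MATCH,		/* option 1: exact digest compare */
	MODEL_REVAL_DRIVER_CHECK,	/* option 2: driver same-device policy */
};

struct model_reval {
	enum model_reval_policy policy;
	unsigned char expected[MODEL_DIGEST_LEN];
	int (*driver_check)(const unsigned char *evidence, size_t len);
};

/* Placeholder for a driver policy: accepts any non-empty evidence. */
static int model_placeholder_driver_check(const unsigned char *evidence,
					  size_t len)
{
	return (evidence && len) ? 0 : -1;
}

/* Returns 0 if the device passes the selected policy, -1 otherwise. */
static int model_revalidate(const struct model_reval *r,
			    const unsigned char *digest,
			    const unsigned char *evidence, size_t len)
{
	assert(r != NULL);
	switch (r->policy) {
	case MODEL_REVAL_HASH_MATCH:
		if (!digest)
			return -1;
		return memcmp(r->expected, digest, MODEL_DIGEST_LEN) ? -1 : 0;
	case MODEL_REVAL_DRIVER_CHECK:
		return r->driver_check ? r->driver_check(evidence, len) : -1;
	}
	return -1;	/* unknown policy: reject */
}
```

Further menu entries would just be more cases in the switch, while the
caller in the recovery path stays the same.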

> 3) User provides a BPF program and it runs on the evidence
>
> 4) Standards bodies define a generic "same device check" algorithm for
> the evidence and core kernel just does this for compatible devices
>
> > The equivalent behavior to what is present in this SPDM proposal is to
> > extend drivers/pci/tsm/evidence.c to add a netlink operation that tells
> > the kernel to cache the public-key and challenge the device to regenerate a
> > valid signature. Then plumb all the recovery paths to call a new
> > 'struct pci_tsm_ops::revalidate()' operation in all the same places
> > where this patch set wants to reauthenticate. Later when something more
> > sophisticated than "challenge the device to create a signature" comes
> > along it can reuse those revalidate() hooks.
>
> That's a nice simple starting point.

Appreciate it. Still want to close the loop with Lukas to make sure
everything he wants to do for native SPDM can fit in this scheme, but I
am feeling increasingly confident.