Re: [PATCH v2 00/21] Runtime TDX Module update support
From: Vishal Annapurve
Date: Sun Oct 26 2025 - 18:02:16 EST
On Sun, Oct 26, 2025 at 2:30 PM <dan.j.williams@xxxxxxxxx> wrote:
>
> Vishal Annapurve wrote:
> > On Fri, Oct 24, 2025 at 6:42 PM <dan.j.williams@xxxxxxxxx> wrote:
> > >
> > > Vishal Annapurve wrote:
> > > > On Fri, Oct 24, 2025 at 2:19 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> > > > >
> > > > > On 10/24/25 14:12, dan.j.williams@xxxxxxxxx wrote:
> > > > > >> The SGX solution, btw, was to at least ensure forward progress (CPUSVN
> > > > > >> update) when the last enclave goes away. So new enclaves aren't
> > > > > >> *prevented* from starting but the window when the first one starts
> > > > > >> (enclave count going from 0->1) is leveraged to do the update.
> > > > > > The status quo does ensure forward progress. The TD does get built and
> > > > > > the update does complete, just the small matter of TD attestation
> > > > > > failures, right?
> > > >
> > > > I would think that it's not a "small" problem if confidential
> > > > workloads on the hosts are not able to pass attestation.
> > >
> > > "Small" as in "not the kernel's problem". Userspace asked for the
> > > update, update is documented to clobber build sometimes, userspace ran
> > > an update anyway. Userspace asked for the clobber.
> > >
> > > It would be lovely if this clobbering does not happen at all and the
> > > update mechanism did not come with this misfeature. Otherwise, the kernel
> > > has no interface to solve that problem. The best it can do is document
> > > that this new update facility has this side effect.
> >
> > In this case, host kernel has a way to ensure that userspace can't
> > trigger such clobbering at all.
>
> Unless the clobber condition can be made atomic with respect to update
> so that both succeed, the kernel needs to punt the synchronization
> problem to userspace.
>
> A theoretical TDX Module change could ensure that atomicity.
IIUC TDX module already supports avoiding this clobber based on the
TDH.SYS.SHUTDOWN documentation from section 5.4.73 of TDX ABI Spec
[1].
Host kernel needs to set bit 16 of RCX when invoking TDH.SYS.SHUTDOWN,
if the TDX Module supports it.
"If supported by the TDX Module, the host VMM can set the
AVOID_COMPAT_SENSITIVE flag to request the TDX Module to fail
TDH.SYS.UPDATE if any of the TDs are currently in a state that is
impacted by the update-sensitive cases."
IIUC the above documentation should say TDH.SYS.SHUTDOWN instead of
TDH.SYS.UPDATE.
[1] https://cdrdv2.intel.com/v1/dl/getContent/733579
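For illustration only, a minimal sketch of how the bit 16 flag described above could be folded into the RCX input for TDH.SYS.SHUTDOWN. The macro name, the helper name and the "supported" parameter are hypothetical and are not taken from the patch series or from existing kernel code; how the remaining RCX bits are composed and how support is enumerated are left to the caller here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit 16 of RCX for TDH.SYS.SHUTDOWN (AVOID_COMPAT_SENSITIVE).
 * The macro name is made up for this sketch.
 */
#define TDX_SHUTDOWN_AVOID_COMPAT_SENSITIVE	(UINT64_C(1) << 16)

/*
 * Hypothetical helper: take whatever RCX value the caller already
 * built for TDH.SYS.SHUTDOWN and set the flag only when the module
 * is known to support it. How 'supported' is discovered is an
 * assumption left outside this sketch.
 */
static uint64_t tdx_shutdown_rcx(uint64_t base_rcx, bool supported)
{
	if (supported)
		base_rcx |= TDX_SHUTDOWN_AVOID_COMPAT_SENSITIVE;

	return base_rcx;
}
```

With the flag set, the expectation from the quoted text is that the module fails the request instead of clobbering an in-progress TD build, so the caller would have to treat that failure as "retry the update later".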
> A theoretical change to the kernel's build ABI could effect that as well,
> or notify the collision, i.e. a flag at the finalization stage saying that an
> update happened during the build sequence and the build needs a restart. This is the
> role of "generation" in the tsm_report ABI. As far as I understand
> userspace just skips that ABI and arranges for userspace synchronized
> access to tsm_report.
>
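To make the "generation" idea quoted above concrete, here is a minimal userspace-style sketch of a generation-checked retry: sample a counter, run the build, and accept the result only if the counter did not move. Everything here (struct build_ctx, build_with_generation_check(), the callbacks) is hypothetical and only models the pattern; it is not the tsm_report ABI nor any interface proposed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state: 'generation' is bumped by every update. */
struct build_ctx {
	uint64_t generation;
};

/* One build attempt; an update may race with it and bump the counter. */
typedef void (*build_fn)(struct build_ctx *ctx, void *arg);

/*
 * Run 'build' until one attempt completes with no update in between,
 * or give up after 'max_tries' attempts. Returns true on a clean build.
 */
static bool build_with_generation_check(struct build_ctx *ctx, build_fn build,
					void *arg, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		uint64_t before = ctx->generation;

		build(ctx, arg);

		/* Unchanged generation: no update collided with this build. */
		if (ctx->generation == before)
			return true;
	}

	return false;
}

/* Sample callback: a build that never collides with an update. */
static void quiet_build(struct build_ctx *ctx, void *arg)
{
	(void)ctx;
	(void)arg;
}

/*
 * Sample callback: simulates an update landing during the build for
 * the first '*arg' attempts by bumping the generation.
 */
static void racing_build(struct build_ctx *ctx, void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		ctx->generation++;
	}
}
```

The point of the pattern is that the collision is detected and reported to the builder rather than prevented, which leaves the retry policy to userspace.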