Re: [PATCH v7 10/22] x86/virt/seamldr: Abort updates if errors occurred midway

From: Chao Gao

Date: Tue Apr 14 2026 - 06:06:42 EST

On Sat, Apr 11, 2026 at 09:26:58AM +0800, Edgecombe, Rick P wrote:
>On Tue, 2026-03-31 at 05:41 -0700, Chao Gao wrote:
>> The TDX module update process has multiple steps, each of which may
>> encounter failures.
>>
>> The current state machine of updates proceeds to the next step regardless
>> of errors. But continuing updates when errors occur midway is pointless.
>
>This kind of begs the question of how much it matters if some pointless work
>happens in error condition during a rare operation. I'm thinking at this point,
>aha!, do we need this?
>
>>
>> Abort the update by setting a flag to indicate that a CPU has encountered
>> an error, forcing all CPUs to exit the execution loop. Note that failing
>> CPUs do not acknowledge the current step. This keeps all other CPUs waiting
>> in the current step (since advancing to the next step requires all CPUs to
>> acknowledge the current step) until they detect the fault flag and exit the
>> loop.
>
>So is the point of the patch to prevent the operation from getting stuck? Or
>saving the user experiencing a failed update a little time?

Good question.

The main point is correctness, not saving time.

If shutdown fails midway, the update is still recoverable: TDs can continue
running. But if we proceed to seamldr.install anyway, the update becomes
destructive. Aborting early on shutdown failure preserves recoverability (this
is needed to handle races between updates and TD build/migration).

If seamldr.install itself fails, it's already destructive, so aborting early
there just saves time. But using the same abort mechanism for both keeps the
error handling uniform.
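
To make the abort behavior concrete, here is a minimal userspace sketch of
the idea. It is not the patch code: the step names, struct layouts, and the
round-robin "scheduler" are all hypothetical, chosen only to model CPUs that
ack each step, with a shared flag that makes everyone leave the loop once
any CPU fails.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical step names; the real series defines its own states. */
enum step { STEP_SHUTDOWN, STEP_INSTALL, STEP_DONE };

struct update_ctl {
	enum step step;   /* step all CPUs are currently working on */
	int acks;         /* CPUs that completed the current step */
	bool failed;      /* set by any CPU that hits an error */
	int installs;     /* how many CPUs ran the destructive step */
};

struct cpu_state {
	int next;         /* next step this CPU still has to run */
	bool exited;      /* CPU has left the execution loop */
};

/* One loop iteration on one CPU. fail_cpu/fail_step inject an error. */
static void cpu_tick(struct update_ctl *ctl, struct cpu_state *cpu, int id,
		     int fail_cpu, enum step fail_step)
{
	if (ctl->failed || ctl->step == STEP_DONE) {
		cpu->exited = true;   /* abort or normal completion */
		return;
	}
	if (cpu->next != (int)ctl->step)
		return;               /* already acked; wait for the others */

	if (id == fail_cpu && ctl->step == fail_step) {
		ctl->failed = true;   /* no ack: the step can never advance */
		cpu->exited = true;
		return;
	}
	if (ctl->step == STEP_INSTALL)
		ctl->installs++;

	assert(ctl->acks < NR_CPUS);
	cpu->next++;
	if (++ctl->acks == NR_CPUS) { /* last ack moves everyone forward */
		ctl->acks = 0;
		ctl->step = (enum step)(ctl->step + 1);
	}
}

/* Round-robin over the CPUs until all of them have left the loop. */
static struct update_ctl run_update(int fail_cpu, enum step fail_step)
{
	struct update_ctl ctl = { .step = STEP_SHUTDOWN };
	struct cpu_state cpus[NR_CPUS] = { 0 };
	int running = NR_CPUS;

	while (running > 0) {
		for (int i = 0; i < NR_CPUS; i++) {
			if (cpus[i].exited)
				continue;
			cpu_tick(&ctl, &cpus[i], i, fail_cpu, fail_step);
			if (cpus[i].exited)
				running--;
		}
	}
	return ctl;
}
```

In this model, run_update(2, STEP_SHUTDOWN) ends with the state still at
STEP_SHUTDOWN and installs == 0, i.e. the destructive step is never reached.
With run_update(2, STEP_INSTALL) some CPUs have already run install, so the
abort only cuts the remaining work short.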