Re: [PATCH 2/2] x86/tdx: Accept hotplugged memory before online

From: Pratik R. Sampat

Date: Mon Mar 30 2026 - 11:13:06 EST




On 3/26/26 4:40 PM, Edgecombe, Rick P wrote:
> Hi Paolo!
>
> On Thu, 2026-03-26 at 19:25 +0100, Paolo Bonzini wrote:
>>> Another option could be to perform a TDG.MEM.PAGE.RELEASE TDCALL from
>>> the guest when it unplugs the memory, to put it in an unaccepted state.
>>> This would be more robust to buggy VMM behavior. But working around
>>> buggy VM behavior would need a high bar.
>>
>> Wouldn't it actually be a very low bar? Just from these two paragraphs
>> of yours, it's clear that the line between buggy and malicious is
>> fine, in fact I think userspace should not care at all about removing
>> the memory. Only the guest cares about acceptance state.
>>
>> Doing a RELEASE TDCALL seems more robust and not hard.
>
> I mean I guess the contract is a bit fuzzy. The reason why I was thinking it was
> a host userspace bug is because the conventional bare metal behavior of
> unplugging memory should be that it is no longer accessible, right? If the guest
> could still use the unplugged memory, it could be surprising for userspace and
> the guest. Also, ideally I'd think the behavior wouldn't cover up guest bugs
> where it tried to keep using the memory. So forgetting about TDX, isn't it
> better behavior in general for unplugging memory, to actually pull it from the
> guest? Did I look at that wrong?
>
> As for the bar to change the guest, I was first imagining it would be the size
> of the accept memory plumbing. Which was not a small effort and has had a steady
> stream of bugs to squash where the accept was missed.
>
> But I didn't actually POC anything to check the scope so maybe that was a bit
> hasty. Should we do a POC? But considering the scope, I wonder if SNP has the
> same problem.

SNP likely has an analogous issue too.
If the guest does not switch the page state on remove, the RMP entry remains
validated. A malicious hypervisor could then remap this GPA to another HPA,
which would put this GPA in the Guest-Invalid state. On re-hotplug, if we
ignore errors as suggested by Patch 1 (in our case that would likely be
PVALIDATE_FAIL_NOUPDATE), we could end up with two RMP entries for the same
GPA, both validated. This is dangerous because the hypervisor could swap
between them at will.

Would it not be better to track this state in the unaccepted bitmap, which we
could then query explicitly to decide whether to accept/unaccept?
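
To make the idea concrete, here is a rough userspace sketch of the bookkeeping
I have in mind. It is not kernel code: struct unaccepted_map, unaccept_range(),
accept_range() and the 2 MiB unit size are all hypothetical names and
assumptions for illustration, and the actual accept/release operations are
only marked by comments.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one bit per 2 MiB unit, 64 units tracked in this toy model. */
#define UNIT_SHIFT 21
#define UNIT_SIZE  (UINT64_C(1) << UNIT_SHIFT)
#define NR_UNITS   64

/* Hypothetical stand-in for the unaccepted bitmap: bit set => unaccepted. */
struct unaccepted_map {
	uint64_t bits;
};

/*
 * On unplug: record [start, end) as unaccepted. Real code would first
 * release / rescind validation of the pages, then set the bits.
 * start and end are assumed to be unit-aligned.
 */
static void unaccept_range(struct unaccepted_map *m, uint64_t start,
			   uint64_t end)
{
	for (uint64_t unit = start >> UNIT_SHIFT;
	     unit < (end >> UNIT_SHIFT); unit++) {
		assert(unit < NR_UNITS);
		m->bits |= UINT64_C(1) << unit;
	}
}

/*
 * On hotplug: accept only the units the bitmap says are unaccepted.
 * Returns the number of units accepted. Units that are already accepted
 * are skipped, so no "already accepted" error has to be ignored.
 */
static unsigned int accept_range(struct unaccepted_map *m, uint64_t start,
				 uint64_t end)
{
	unsigned int accepted = 0;

	for (uint64_t unit = start >> UNIT_SHIFT;
	     unit < (end >> UNIT_SHIFT); unit++) {
		assert(unit < NR_UNITS);
		if (!(m->bits & (UINT64_C(1) << unit)))
			continue;
		/* Real code would issue the accept / validate operation here. */
		m->bits &= ~(UINT64_C(1) << unit);
		accepted++;
	}
	return accepted;
}
```

With this shape, a second accept_range() over the same range is a no-op
instead of an error path, which is the property I am after.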

For ACPI hardware-style hotplug, I have been working with the UEFI side on a
POC that reflects SRAT hotplug windows in UEFI_UNACCEPTED_MEMORY using the
EFI_MEMORY_HOT_PLUGGABLE attribute, and on modifying the spec accordingly. I'm
less sure what this description would look like for virtio-mem, and whether it
would be possible to do this at early boot.

Thanks,
--Pratik