Re: [PATCH] x86/tdx: Allow extended topology CPUID leafs to be emulated by hypercall

From: Dave Hansen
Date: Fri Sep 08 2023 - 18:24:32 EST


On 9/8/23 12:25, Sagi Shahar wrote:
> On Fri, Sep 8, 2023 at 11:00 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>>
>> On 9/8/23 10:56, Sagi Shahar wrote:
>>> The current TDX module does not handle extended topology leaves
>>> explicitly and will generate a #VE but the current #VE handler
>>> implementation blindly returns 0 for those CPUID leaves.
>>>
>>> This currently causes TDX guests to see 0 values when querying the numa
>>> topology leading to incorrect numa configurations.
>>>
>>> This patch fixes this behavior by emulating the extended topology leaves
>>> using the CPUID hypercall.
>>
>> ... and thus acquires the data from the untrusted VMM. Right?
>>
>> What are the security implications of consuming this untrusted data?
>
> The topology information is mostly used for performance optimizations
> on the guest side. I don't see any security implications if VMM passes
> incorrect values.

Oh, so I take it you did an audit and checked that no data structures
are sized or accessed based on this information.

Could you share some of your analysis, please? I'd love to see it in
the changelog.

> Right now, the guest is already using the returned 0 values and gets
> an incorrect numa topology leading to odd behavior in the guest. If we
> allow guests to read these values from the untrusted VMM and VMM
> spoofs the values, the worst that can happen is a different incorrect
> numa topology instead of the incorrect one we already have today.

I'm going to have to disagree with this logic a wee bit.

The 0's have *KNOWN*, *FIXED* behavior. It may stink, but it's known
and fixed and thoroughly unexploitable. If it were somehow exploited,
we would change it to another value that we control.

The VMM values are fundamentally different. They're dynamic and can be
maliciously crafted.

I'm actually not even sure how you're getting bad "NUMA" data out of
these leaves. The check_extended_topology_leaf() checks should fail
because ecx will be all 0's and SMT_TYPE==1, so the (LEAFB_SUBTYPE(ecx)
!= SMT_TYPE) condition will always hit.

I'm also not sure where the "numa topology" comment comes from. There's
a _lot_ of topology information in these leaves that's at a much finer
granularity than NUMA nodes. If you're getting errors on the console,
it would be spectacular to share those.

I have all kinds of guesses about what trouble this might be causing
you, but I'm a bad guesser.