Re: [PATCHv7 00/14] mm, x86/cc: Implement support for unaccepted memory

From: Marc Orr
Date: Wed Jul 20 2022 - 13:03:58 EST


On Tue, Jul 19, 2022 at 10:44 PM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Tue, Jul 19, 2022 at 05:26:21PM -0700, Marc Orr wrote:
> > These feature tags are a mess to keep track of.
>
> Well, looking at those tags, it doesn't look like you'll stop using them
> anytime soon.
>
> And once all the required SNP/TDX features are part of the guest image,
> - including unaccepted memory - if anything, you'll have less tags.
>
> :-)

Yeah, once all of the features are a part of the guest image AND any
older images with SNP/TDX minus the features are deprecated. I agree.

> > - Do we anticipate (many) more features for confidential compute in
> > the future that require code in both the guest FW and guest kernel? If
> > yes, then designing a FW-kernel feature negotiation could be useful
> > beyond this situation.
>
> Good question.
>
> > - Dave's suggestion to "2. Boot some intermediate thing like a
> > bootloader that does acceptance ..." is pretty clever! So if upstream
> > thinks this FW-kernel negotiation is not a good direction, maybe we
> > (Google) can pursue this idea to avoid introducing yet another tag on
> > our images.
>
> Are those tags really that nasty so that you guys are looking at
> upstream changes just to avoid them?

Generally, no. But the problem with tags is that distros tag their
images wrong sometimes. And that leads to problems. For example, I
just got a bug assigned to me yesterday about some ARM image tagged as
SEV_CAPABLE. Oops. Lol :-). (Though, I'm pretty sure we won't try to
boot an ARM image on a non-ARM host anyway; but it's still wrong...)

That being said, this lazy accept problem is sort of a special case,
since it requires deploying code to the guest FW and the guest kernel.
I'm still relatively new at all of this, but other than the
SNP/TDX-enlightenment patches themselves, I haven't really seen any
other examples of this. So that goes back to my previous question. Is
this going to happen a lot more? If not, I can definitely see value in
the argument to skip the complexity of the FW/kernel feature
negotiation.

Another thing I thought of since my last reply, that's mostly an
internal solution to this problem on our side: Going back to Dave's
10k-foot view of the different angles of how to solve this. For "1.
Deal with that at the host level configuration", I'm thinking we could
tag the images with their internal guest kernel version. For example,
if an image has a 5.15 kernel, then we could have a `KERNEL_5_15` tag.
This would then allow us to have logic in the guest FW like:

if (guest_kernel_is_at_least(/*major=*/5, /*minor=*/15)
enable_lazy_accept = true;

One detail I actually missed in all of this, is how the guest image
tag gets propagated into the guest FW in this approach. (Apologies for
this, as that's a pretty big oversight on my part.) Dionna: Have you
thought about this? Presumably this requires some sort of paravirt for
the guest to ask the host. And for any paravirt interface, now we need
to think about if it degrades the security of the confidential VMs.
Though, using it to get the kernel version to decide whether or not to
accept the memory within the guest UEFI or mark it as unaccepted seems
fine from a security angle to me.

Also, tagging images with their underlying kernel versions still seems
susceptible to mis-labeling. But this seems like it can be mostly
"fixed" via automation (e.g., write a tool to boot the guest and ask
it what it's kernel version is and use the result to attach the tag).
Also, tagging the images with their kernel version seems like a much
more general solution to these sorts of issues.

Thoughts?