Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers

From: Michael Roth
Date: Thu Jan 27 2022 - 12:23:58 EST


On Wed, Jan 19, 2022 at 10:27:47AM -0600, Michael Roth wrote:
> On Wed, Jan 19, 2022 at 12:17:22PM +0100, Borislav Petkov wrote:
> > On Tue, Jan 18, 2022 at 07:18:06PM -0600, Michael Roth wrote:
> > > If 'fake_count'/'reported_count' is greater than the actual number of
> > > entries in the table, 'actual_count', then all table entries up to
> > > 'fake_count' will also need to pass validation. Generally the table
> > > will be zero'd out initially, so those additional/bogus entries will
> > > be interpreted as a CPUID leaves where all fields are 0. Unfortunately,
> > > that's still considered a valid leaf, even if it's a duplicate of the
> > > *actual* 0x0 leaf present earlier in the table. The current code will
> > > handle this fine, since it scans the table in order, and uses the
> > > valid 0x0 leaf earlier in the table.
> >
> > I guess it would be prudent to have some warnings when enumerating those
> > leafs and when the count index "goes off into the weeds", so to speak,
> > and starts reading 0-CPUID entries. I.e., "dear guest owner, your HV is
> > giving you a big lie: a weird/bogus CPUID leaf count..."
> >
> > :-)
>

Sorry for the delay, this response got stuck in my mail queue apparently.

> Ok, there's some sanity checks that happen a little later in boot via
> snp_cpuid_check_status(), after printk is enabled, that reports some
> basic details to dmesg like the number of entries in the table. I can
> add some additional sanity checks to flag the above case (really,
> all-zero entries never make sense, since CPUID 0x0 is supposed to report
> the max standard-range CPUID leaf, and leaf 0x1 at least should always
> be present). I'll print a warning for such cases, add maybe dump the
> cpuid the table in that case so it can be examined more easily by
> owner.
>
> >
> > And lemme make sure I understand it: the ->count itself is not
> > measured/encrypted because you want to be flexible here and supply
> > different blobs with different CPUID leafs?
>
> Yes, but to be clear it's the entire CPUID page, including the count,
> that's not measured (though it is encrypted after passing PSP
> validation). Probably the biggest reason is the logistics of having
> untrusted cloud vendors provide a copy of the CPUID values they plan
> to pass to the guest, since a new measurement would need to be
> calculated for every new configuration (using different guest
> cpuflags, SMP count, etc.), since those table values will need to be
> made easily-accessible to guest owner for all these measurement
> calculations, and they can't be trusted so each table would need to
> be checked either manually or by some tooling that could be difficult
> to implement unless it was something simple like "give me the expected
> CPUID values and I'll check if the provided CPUID table agrees with
> that".
>
> At that point it's much easier for the guest owner to just check the
> CPUID values directly against known good values for a particular
> configuration as part of their attestation process and leave the
> untrusted cloud vendor out of it completely. So not measuring the
> CPUID page as part of SNP attestation allows for that flexibility.
>
> >
> > > This is isn't really a special case though, it falls under the general
> > > category of a hypervisor inserting garbage entries that happen to pass
> > > validation, but don't reflect values that a guest would normally see.
> > > This will be detectable as part of guest owner attestation, since the
> > > guest code is careful to guarantee that the values seen after boot,
> > > once the attestation stage is reached, will be identical to the values
> > > seen during boot, so if this sort of manipulation of CPUID values
> > > occurred, the guest owner will notice this during attestation, and can
> > > abort the boot at that point. The Documentation patch addresses this
> > > in more detail.
> >
> > Yap, it is important this is properly explained there so that people can
> > pay attention to during attestation.
> >
> > > If 'fake_count' is less than 'actual_count', then the PSP skips
> > > validation for anything >= 'fake_count', and leaves them in the table.
> > > That should also be fine though, since guest code should never exceed
> > > 'fake_count'/'reported_count', as that's a blatant violation of the
> > > spec, and it doesn't make any sense for a guest to do this. This will
> > > effectively 'hide' entries, but those resulting missing CPUID leaves
> > > will be noticeable to the guest owner once attestation phase is
> > > reached.
> >
> > Noticeable because the guest owner did supply a CPUID table with X
> > entries but the HV is reporting Y?
>
> Or even more simply by the guest owner simply running 'cpuid -r -1' on
> the guest after boot, and making sure all the expected entries are
> present. If the HV manipulated the count to be lower, there would be
> missing entries, if they manipulated it to be higher, then there would
> either be extra duplicate entries at the end of the table (which the
> #VC handler would ignore due to it using the first matching entry in
> the table when doing lookups), or additional non-duplicate garbage
> entries, which will show up in 'cpuid -r -1' as unexpected entries.
>
> Really 'cpuid -r -1' is the guest owner/userspace view of things, so
> some of these nuances about the table contents might be noteworthy,
> but wouldn't actually affect guest behavior, which would be the main
> thing attestation process should be concerned with.
>
> >
> > If so, you can make this part of the attestation process: guest owners
> > should always check the CPUID entries count to be of a certain value.
> >
> > > This does all highlight the need for some very thorough guidelines
> > > on how a guest owner should implement their attestation checks for
> > > cpuid, however. I think a section in the reference implementation
> > > notes/document that covers this would be a good starting point. I'll
> > > also check with the PSP team on tightening up some of these CPUID
> > > page checks to rule out some of these possibilities in the future.
> >
> > Now you're starting to grow the right amount of paranoia - I'm glad I
> > was able to sensitize you properly!
> >
> > :-)))
>
> Hehe =*D
>
> Thanks!
>
> -Mike