Re: [PATCH] KVM: arm64: Fix confusion in documentation for pKVM SME assert

From: Mark Brown
Date: Tue Jun 04 2024 - 16:42:45 EST


On Tue, Jun 04, 2024 at 12:17:41PM -0700, Oliver Upton wrote:
> On Tue, Jun 04, 2024 at 07:47:01PM +0100, Mark Brown wrote:

> > As raised in the review comments for the original patch the assert and
> > comment added in afb91f5f8ad7 ("KVM: arm64: Ensure that SME controls are
> > disabled in protected mode") are bogus. The comment says that we check
> > that we do not have SME enabled for a pKVM guest but the assert actually
> > checks to see if the host has anything set in SVCR which is unrelated to
> > the guest features or state, regardless of if those guests are protected
> > or not.

> > What I believe the check is actually intended to validate is that we do
> > not enter the pKVM hypervisor with SME enabled since the pKVM hypervisor
> > does not yet understand SME and is therefore unable to save or restore
> > host state with SME enabled. Indeed, attempting to save SVE state would
> > fault if streaming mode is enabled on a system without FA64 due to FFR.
> > Update the comment to reflect this.

> The added context likely isn't necessary in what winds up getting
> applied. Can this just directly state the WARN_ON() exists b/c the
> protected mode hypervisor doesn't know how to manage SME state?

It could definitely be briefer. I was being super detailed because I was
guessing whether the intent was what the comment says or what the code
says, or whether we want an assert here at all for that matter. It seemed
better to be verbose.

> > WARN_ON(is_protected_kvm_enabled() && system_supports_sme() &&
> > read_sysreg_s(SYS_SVCR));

> While we're here, this should be WARN_ON_ONCE() or WARN_RATELIMIT() if
> we _really_ want some spam. But a single WARN ought to be enough.

Good point, and I agree that WARN_ON_ONCE is the better option.
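
For illustration only, a rough userspace model of the once-only behaviour
being asked for. This is a sketch, not kernel code: the fake_* variables
and check_host_sme_disabled() are hypothetical stand-ins for
is_protected_kvm_enabled(), system_supports_sme() and
read_sysreg_s(SYS_SVCR), and the static flag only approximates what
WARN_ON_ONCE() does at a call site.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-ins for is_protected_kvm_enabled(),
 * system_supports_sme() and read_sysreg_s(SYS_SVCR). They are plain
 * variables so the behaviour can be exercised outside the kernel.
 */
static bool fake_protected_kvm = true;
static bool fake_has_sme = true;
static uint64_t fake_svcr;

/* Counts how many times a warning was actually reported. */
static int warnings_emitted;

/*
 * Models the check with once-only reporting: the return value reflects
 * the condition on every call, but the message is printed only the
 * first time the condition is true.
 */
static bool check_host_sme_disabled(void)
{
	static bool warned;
	bool bad = fake_protected_kvm && fake_has_sme && fake_svcr != 0;

	if (bad && !warned) {
		warned = true;
		warnings_emitted++;
		fprintf(stderr, "WARNING: SVCR=%#llx on entry to pKVM\n",
			(unsigned long long)fake_svcr);
	}

	return bad;
}
```

Calling check_host_sme_disabled() repeatedly with a non-zero fake_svcr
still returns true each time, but only the first call logs anything,
which is the "single WARN" behaviour suggested above.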

> It'd be a good idea to also document why we're testing for SME state
> twice on the KVM_RUN path, as any WARN() in the hyp code is currently
> fatal. I'm guessing Fuad meant to have a non-fatal way of getting some
> debug information out.

Yes, this was one of the reasons I was unclear if the check or the
comment was the intention. Possibly it's due to getting much better
diagnostics for a warning generated in the host kernel than an error
returned from the hypervisor when we try to run the guest?
