Re: [PATCH 1/4] KVM: arm64: Trap FFA_VERSION host call in pKVM

From: Sebastian Ene
Date: Tue May 07 2024 - 05:17:43 EST


On Fri, May 03, 2024 at 05:21:14PM +0100, Will Deacon wrote:
> On Fri, May 03, 2024 at 03:29:12PM +0000, Sebastian Ene wrote:
> > On Fri, May 03, 2024 at 03:39:38PM +0100, Will Deacon wrote:
> > > On Thu, Apr 18, 2024 at 04:30:23PM +0000, Sebastian Ene wrote:
> > > > The pKVM hypervisor initializes with FF-A version 1.0. Keep the
> > > > supported version inside the host structure and prevent the host
> > > > drivers from overwriting the FF-A version with an increased version.
> > > > Without trapping the call, the host drivers can negotiate a higher
> > > > version number with TEE which can result in a different memory layout
> > > > described during the memory sharing calls.
> > > >
> > > > Signed-off-by: Sebastian Ene <sebastianene@xxxxxxxxxx>
> > > > ---
> > > > arch/arm64/kvm/hyp/nvhe/ffa.c | 43 ++++++++++++++++++++++++++++++++---
> > > > 1 file changed, 40 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/kvm/hyp/nvhe/ffa.c b/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > index 320f2eaa14a9..023712e8beeb 100644
> > > > --- a/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > +++ b/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > @@ -58,6 +58,7 @@ struct kvm_ffa_buffers {
> > > > hyp_spinlock_t lock;
> > > > void *tx;
> > > > void *rx;
> > > > + u32 ffa_version;
> > > > };
> > >
> > > Why should this be part of 'struct kvm_ffa_buffers'? The host, proxy and
> > > Secure side will end up using the same version, so a simple global
> > > variable would suffice, no?
> > >
> > I prefer keeping it here as we will have more clients in the future /
> > different VMs and each one of them will have its own version and its own
> > pair of buffers.
>
> We can add that when we need to. Let's keep it simple for now, as the
> idea of the proxy having to support multiple versions of the spec at
> once sounds terrifying to me. I don't think we're going to want to
> re-marshall the data structures between the 1.0 and 1.1 formats, are we?
>

I don't think we increase the complexity of the code by keeping this
argument in the structure. The code in nvhe/ffa.c supports marshalling
the structure as of [this
change](https://lore.kernel.org/r/20231005-ffa_v1-1_notif-v4-14-cddd3237809c@xxxxxxx
) and that is why I was in favor of keeping the version where it belongs
to.

> > > > @@ -640,6 +641,39 @@ static bool do_ffa_features(struct arm_smccc_res *res,
> > > > return true;
> > > > }
> > > >
> > > > +static void do_ffa_version(struct arm_smccc_res *res,
> > > > + struct kvm_cpu_context *ctxt)
> > > > +{
> > > > + DECLARE_REG(u32, ffa_req_version, ctxt, 1);
> > > > + u32 current_version;
> > > > +
> > > > + hyp_spin_lock(&host_buffers.lock);
> > >
> > > Why do you need to take the lock for this?
> > >
> >
> > Because we interpret the host buffer content based on the version that we
> > end up setting here and each time we are accessing these buffers we are
> > protected by this lock.
>
> I think that's indicative of a broader issue, though, which is that you
> allow for the version to be re-negotiated at runtime. The spec doesn't
> allow that and I don't think we should either.
>

The spec talks about interopeartion in case two versions (x,y) and (a,b)
want to talk:

- given the pairs (x,y) and (a,b) x=major, y=minor if x == a and y > b
the versions are incompatible until y downgrades its version such that
y <= b.

>From this I drew the conclusion that the spec allows the re-negotiation
at runtime, please let me know if you see things differently.

> > > > + /*
> > > > + * If the client driver tries to downgrade the version, we need to ask
> > > > + * first if TEE supports it.
> > > > + */
> > > > + if (FFA_MINOR_VERSION(ffa_req_version) < FFA_MINOR_VERSION(current_version)) {
> > >
> > > Similarly here, I don't think 'current_version' is what we should expose.
> > > Rather, we should be returning the highest version that the proxy
> > > supports in the host, which is 1.0 at this point in the patch series.
> >
> > We already report the highest version that the proxy supports on line:
> > `res->a0 = current_version;`
> >
> > 'current_version' is assigned during proxy initialization.
> > This check allows us to downgrade the supported ffa_version if the Host
> > requested it and only if TF-A supports it.
>
> I don't think we want the host negotiating directly with the Secure side,
> though, do we? 'current_version' is initialised to whatever the Secure
> side gives us, so if we had something like:
>
> 1. Proxy initialises, issues FFA_VERSION(1.0)

This will save 1.0 in host_buffers.ffa_version

> 2. Secure implements 1.2 and so returns 1.2 but remains in 1.0
> compatability mode for the data structure formats.

Ack.

> 3. The host issues FFA_VERSION(1.1)

The call is trapped and we verify if the requested version
(FFA_VERSION(1.1) is smaller than our current_version saved in step 1.

Given that is not smaller we only reply with our current supported
version which is FFA_VERSION(1.0) and we return to the host.

> 4. The code above then issues FFA_VERSION(1.1) to Secure and it
> switches over to the 1.1 data structures

This was happening prior to my patch, so in a way this patch is a bugfix.
We get this behavior without trapping and interpretting
of the FFA_VERSION API.

>
> Did I get that right?
>
> I really think we need to settle on a single version for the host,
> hypervisor and Secure and then stick with it following a single
> negotiation stage.
>
> > > > + arm_smccc_1_1_smc(FFA_VERSION, ffa_req_version, 0,
> > > > + 0, 0, 0, 0, 0,
> > > > + res);
> > >
> > > Hmm, I'm struggling to see how this is supposed to work per the spec.
> > > The FF-A spec says:
> > >
> > > | ... negotiation of the version must happen before an invocation of
> > > | any other FF-A ABI.
> >
> > I think that is a bit vague in my opinion but what I get is that the first call
> > executed should always be the get version ff-a call.
> >
> > >
> > > and:
> > >
> > > | Once the caller invokes any FF-A ABI other than FFA_VERSION, the
> > > | version negotiation phase is complete.
> > > |
> > > | Once an FF-A version has been negotiated between a caller and a
> > > | callee, the version may not be changed for the lifetime of the
> > > | calling component. The callee must treat the negotiated version as
> > > | the only supported version for any subsequent interactions with the
> > > | caller.>
> > > So by the time we get here, we've already settled on our version with
> > > the Secure side and the host cannot downgrade.
> >
> > At this stage I think the spec didn't take into account that there can be a hypervisor
> > in between.
>
> Well, that's what the spec says and I think we need to follow it. I can
> well imagine that the Secure side won't allow the version to be
> re-negotiated on the fly and I don't think we'd want that in the proxy,
> either.
>
> > > That's a bit rubbish if you ask me, but I think it means we'll have to
> > > defer some of the proxy initialisation until the host calls FFA_VERSION,
> > > at which point we'll need to negotiate a common version between the host,
> > > the proxy and Secure. Once we've done that, our FFA_VERSION handler will
> > > just return that negotiated version.
> >
> > We are already doing this when the ARM driver is built as an external
> > module. If it is not as an external module and is builtin we have a
> > bigger issue because it loads before pKVM at subsys_initcall. This means
> > that we won't trap FFA_MAP* and other setup calls.
>
> Sorry, I don't follow. hyp_ffa_init() issues FFA_FEATURES immediately
> after FFA_VERSION, so we terminate the negotiation right away.

Sorry I confused you, I am afraid I was trying to desribe a different issue
here which is related to how early the ARM FF-A driver initializes when
is builtin - it is before the hypervisor proxy is installed.

>
> Will
>
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@xxxxxxxxxxx.
>

Thank you,
Seb