Re: [PATCH v13 003/113] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module

From: Huang, Kai
Date: Tue Mar 28 2023 - 21:13:57 EST


>
> > >
> > > + /*
> > > + * TDX requires those methods to enable VMXON by
> > > + * kvm_hardware_enable/disable_all()
> > > + */
> > > + static_call_update(kvm_x86_check_processor_compatibility,
> > > + ops->runtime_ops->check_processor_compatibility);
> > > + static_call_update(kvm_x86_hardware_enable,
> > > + ops->runtime_ops->hardware_enable);
> > > + static_call_update(kvm_x86_hardware_disable,
> > > + ops->runtime_ops->hardware_disable);
> > > r = ops->hardware_setup();
> > > if (r != 0)
> > > goto out_mmu_exit;
> >
> > Hmm.. I think this is ugly. Perhaps we should never do any
> > static_call(kvm_x86_xxx)() in hardware_setup(), because hardware_setup() is
> > called before kvm_ops_update() and may update vendor's kvm_x86_ops.
> >
> > So probably use hardware_enable_all() in hardware_setup() is a bad idea.
> >
> > I think we have below options on how to handle:
> >
> > 1) Use VMX's kvm_x86_ops directly in tdx_hardware_setup(). For instance,
> > something like below:
> >
> > int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
> > {
> > ...
> >
> > cpus_read_lock();
> > r = on_each_cpu(vt_x86_ops.hardware_enable, ...);
> > if (!r)
> > r = tdx_module_setup();
> > on_each_cpu(vt_x86_ops.hardware_disable, ...);
> > cpus_read_unlock();
> >
> > ...
> > }
> >
> > But this doesn't clean up nicely when there's some particular cpus fail to do
> > hardware_enable(). To clean up nicely, we do need additional things similar to
> > the hardware_enable_all() code path: a per-cpu variable or a cpumask_t + a
> > wrapper of vt_x86_ops->hardware_enable() to track which cpus have done
> > hardware_enable() successfully.
> >
> > 2) Move those static_call_update() into tdx_hardware_setup() so they are TDX
> > code self-contained. But this would require exposing kvm_x86_ops as symbol,
> > which isn't nice either.
> >
> > 3) Introduce another kvm_x86_init_ops->hardware_post_setup(), which is called
> > after kvm_ops_update().
> >
> > Personally, I think 3) perhaps is the most elegant one, but not sure whether
> > Sean/Paolo has any opinion.
>
> I think we can simply update the ops before calling hardware_enable() and
> clean up ops on failure.
>
>

This doesn't work because hardware_setup() may update vendor's kvm_x86_ops.

If you do kvm_ops_update() before hardware_setup(), you need to manually update
those updated (in hardware_setup()) callbacks again after.