Re: [PATCH V3 4/4] crypto: ccp - Add SEV_INIT_EX support
From: Marc Orr
Date: Fri Nov 12 2021 - 13:28:17 EST
On Fri, Nov 12, 2021 at 9:49 AM Peter Gonda <pgonda@xxxxxxxxxx> wrote:
>
> On Fri, Nov 12, 2021 at 10:46 AM Marc Orr <marcorr@xxxxxxxxxx> wrote:
> >
> > On Fri, Nov 12, 2021 at 8:55 AM Peter Gonda <pgonda@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Nov 10, 2021 at 8:32 AM Peter Gonda <pgonda@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Nov 9, 2021 at 3:20 PM Brijesh Singh <brijesh.singh@xxxxxxx> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 11/9/21 2:46 PM, Peter Gonda wrote:
> > > > > > On Tue, Nov 9, 2021 at 1:26 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > > > >>
> > > > > >> On Tue, Nov 09, 2021, Peter Gonda wrote:
> > > > > >>> On Tue, Nov 9, 2021 at 10:21 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > > > >>>> There's no need for this to be a function pointer, and the duplicate code can be
> > > > > >>>> consolidated.
> > > > > >>>>
> > > > > > >>>> static int sev_do_init_locked(int cmd, void *data, int *error)
> > > > > > >>>> {
> > > > > > >>>>     if (sev_es_tmr) {
> > > > > > >>>>         /*
> > > > > > >>>>          * Do not include the encryption mask on the physical
> > > > > > >>>>          * address of the TMR (firmware should clear it anyway).
> > > > > > >>>>          */
> > > > > > >>>>         data.flags |= SEV_INIT_FLAGS_SEV_ES;
> > > > > > >>>>         data.tmr_address = __pa(sev_es_tmr);
> > > > > > >>>>         data.tmr_len = SEV_ES_TMR_SIZE;
> > > > > > >>>>     }
> > > > > > >>>>     return __sev_do_cmd_locked(SEV_CMD_INIT, &data, error);
> > > > > > >>>> }
> > > > > > >>>>
> > > > > > >>>> static int __sev_init_locked(int *error)
> > > > > > >>>> {
> > > > > > >>>>     struct sev_data_init data;
> > > > > > >>>>
> > > > > > >>>>     memset(&data, 0, sizeof(data));
> > > > > > >>>>     return sev_do_init_locked(cmd, &data, error);
> > > > > > >>>> }
> > > > > > >>>>
> > > > > > >>>> static int __sev_init_ex_locked(int *error)
> > > > > > >>>> {
> > > > > > >>>>     struct sev_data_init_ex data;
> > > > > > >>>>
> > > > > > >>>>     memset(&data, 0, sizeof(data));
> > > > > > >>>>     data.length = sizeof(data);
> > > > > > >>>>     data.nv_address = __psp_pa(sev_init_ex_nv_address);
> > > > > > >>>>     data.nv_len = NV_LENGTH;
> > > > > > >>>>     return sev_do_init_locked(SEV_CMD_INIT_EX, &data, error);
> > > > > > >>>> }
> > > > > >>>
> > > > > >>> I am missing how this removes the duplication of the retry code,
> > > > > >>> parameter checking, and other error checking code.. With what you have
> > > > > > >>> typed out I would assume I still need the function pointer between
> > > > > >>> __sev_init_ex_locked and __sev_init_locked. Can you please elaborate
> > > > > >>> here?
> > > > > >>
> > > > > >> Hmm. Ah, I got distracted between the original thought, the realization that
> > > > > >> the two commands used different structs, and typing up the above.
> > > > > >>
> > > > > >>> Also is there some reason the function pointer is not acceptable?
> > > > > >>
> > > > > >> It's not unacceptable, it would just be nice to avoid, assuming the alternative
> > > > > >> is cleaner. But I don't think any alternative is cleaner, since as you pointed
> > > > > >> out the above is a half-baked thought.
> > > > > >
> > > > > > OK I'll leave as is.
> > > > > >
> > > > > >>
> > > > > >>>>> + rc = init_function(error);
> > > > > >>>>> if (rc && *error == SEV_RET_SECURE_DATA_INVALID) {
> > > > > >>>>> /*
> > > > > >>>>> * INIT command returned an integrity check failure
> > > > > >>>>> @@ -286,8 +423,8 @@ static int __sev_platform_init_locked(int *error)
> > > > > >>>>> * failed and persistent state has been erased.
> > > > > >>>>> * Retrying INIT command here should succeed.
> > > > > >>>>> */
> > > > > >>>>> - dev_dbg(sev->dev, "SEV: retrying INIT command");
> > > > > >>>>> - rc = __sev_do_cmd_locked(SEV_CMD_INIT, &data, error);
> > > > > >>>>> + dev_notice(sev->dev, "SEV: retrying INIT command");
> > > > > >>>>> + rc = init_function(error);
> > > > > >>>>
> > > > > >>>> The above comment says "persistent state has been erased", but __sev_do_cmd_locked()
> > > > > >>>> only writes back to the file if a relevant command was successful, which means
> > > > > >>>> that rereading the userspace file in __sev_init_ex_locked() will retry INIT_EX
> > > > > >>>> with the same garbage data.
> > > > > >>>
> > > > > > >>> Ack my mistake, that comment is stale. I will update it so it's correct
> > > > > >>> for the INIT and INIT_EX flows.
> > > > > >>>>
> > > > > >>>> IMO, the behavior should be to read the file on load and then use the kernel buffer
> > > > > >>>> without ever reloading (unless this is built as a module and is unloaded and reloaded).
> > > > > >>>> The writeback then becomes opportunistic in the sense that if it fails for some reason,
> > > > > >>>> the kernel's internal state isn't blasted away.
> > > > > >>>
> > > > > >>> One issue here is that the file read can fail on load so we use the
> > > > > >>> late retry to guarantee we can read the file.
> > > > > >>
> > > > > >> But why continue loading if reading the file fails on load?
> > > > > >>
> > > > > >>> The other point seems like preference. Users may wish to shutdown the PSP FW,
> > > > > >>> load a new file, and INIT_EX again with that new data. Why should we preclude
> > > > > >>> them from that functionality?
> > > > > >>
> > > > > >> I don't think we should preclude that functionality, but it needs to be explicitly
> > > > > >> tied to a userspace action, e.g. either on module load or on writing the param to
> > > > > >> change the path. If the latter is allowed, then it needs to be denied if the PSP
> > > > > >> is initialized, otherwise the kernel will be in a non-coherent state and AFAICT
> > > > > >> userspace will have a heck of a time even understanding what state has been used
> > > > > >> to initialize the PSP.
> > > > > >
> > > > > > If this driver is builtin the filesystem will be unavailable during
> > > > > > __init. Using the existing retries already built into
> > > > > > > sev_platform_init() allows the file to be read once userspace is
> > > > > > running, meaning the file system is usable. As I tried to explain in
> > > > > > the commit message. We could remove the sev_platform_init call during
> > > > > > sev_pci_init since this only actually needs to be initialized when the
> > > > > > > first command requiring it is issued (either reading some keys/certs
> > > > > > from the PSP or launching an SEV guest). Then userspace in both the
> > > > > > builtin and module usage would know running one of those commands
> > > > > > cause the file to be read for PSP usage. Tom any thoughts on this?
> > > > > >
> > > > >
> > > > > One thing to note is that if we do the INIT on the first command then
> > > > > > the first guest launch will take longer. The init command is not
> > > > > > cheap (especially with SNP, it may take longer because it has to
> > > > > > do all the RMP setup etc). IIRC, in my early SEV series I was doing
> > > > > the INIT during the first command execution and based on the
> > > > > recommendation moved to do the init on probe.
> > > > >
> > > > > Should we add a module param to control whether to do INIT on probe or
> > > > > delay until the first command ?
> > > >
> > > > That's a good point Brijesh. I've only been testing this with SEV and
> > > > ES so haven't noticed that long setup time. I like the idea of a
> > > > module parameter to decide when to INIT, that should satisfy Sean's
> > > > concern that the user doesn't know when the INIT_EX file would be read
> > > > and that there is extra retry code (duplicated between sev_pci_init
> > > > and all the PSP commands). I'll get started on that.
> > >
> > > I need a little guidance on how to proceed with this. Should I have
> > > the new module parameter 'psp_init_on_probe' just disable PSP init on
> > > module init if false. Or should it also disable PSP init during
> > > command flow if it's true?
> > >
> > > I was thinking I should just have 'psp_init_on_probe' default to true,
> > > and if false it stops the PSP init during sev_pci_init(). If I add the
> > > second change that seems like it changes the ABI. Thoughts?
> >
> > What about doing the INIT when we load the KVM module? Does that
> > resolve all of these problems? By the time we load the KVM module, we
> > know that the file system is up, which is the original problem we were
> > trying to solve. And the KVM module is most likely loaded before we
> > run the first guest.
>
> KVM can be compiled as Y as well right? Then KVM module init is still too early.
I think even with KVM built in, it's guaranteed to load after the file system:
* KVM is loaded using `module_init()` (e.g., kvm-amd `module_init()` [1]).
* `module_init()` is defined as `__initcall()` [2].
* `__initcall()` is defined as `device_initcall()` [3].
* Finally, looking at [3] and scrolling up a few lines,
`device_initcall()`s appear to run after the file system init
calls (`fs_initcall()`).
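To make the chain above concrete, here is a minimal userspace sketch in C. It only models the ordering of initcall levels as listed in include/linux/init.h (the rootfs and `_sync` variants are left out). Every name in it (`register_initcall`, the `SKETCH_*` macros, `fake_fs_init`, `fake_kvm_init`) is hypothetical and made up for illustration; it is not how the kernel implements initcalls (the kernel collects them in linker sections), and it says nothing about when a particular file system is actually mounted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Initcall levels in the order they run (subset of include/linux/init.h). */
enum initcall_level {
    LEVEL_PURE = 0,
    LEVEL_CORE,
    LEVEL_POSTCORE,
    LEVEL_ARCH,
    LEVEL_SUBSYS,
    LEVEL_FS,       /* fs_initcall() */
    LEVEL_DEVICE,   /* device_initcall() */
    LEVEL_LATE,
    LEVEL_COUNT
};

typedef void (*initcall_fn)(void);

struct initcall_entry {
    enum initcall_level level;
    initcall_fn fn;
};

#define MAX_ENTRIES 16

static struct initcall_entry registered[MAX_ENTRIES];
static size_t nr_registered;
static const char *run_log[MAX_ENTRIES];
static size_t nr_run;

/* Hypothetical helper: record an init function at a given level. */
static void register_initcall(enum initcall_level level, initcall_fn fn)
{
    assert(nr_registered < MAX_ENTRIES);
    registered[nr_registered].level = level;
    registered[nr_registered].fn = fn;
    nr_registered++;
}

/* Mirror the built-in chain: module_init -> __initcall -> device_initcall. */
#define SKETCH_FS_INITCALL(fn)     register_initcall(LEVEL_FS, fn)
#define SKETCH_DEVICE_INITCALL(fn) register_initcall(LEVEL_DEVICE, fn)
#define SKETCH_INITCALL(fn)        SKETCH_DEVICE_INITCALL(fn)
#define SKETCH_MODULE_INIT(fn)     SKETCH_INITCALL(fn)

static void log_run(const char *name)
{
    assert(nr_run < MAX_ENTRIES);
    run_log[nr_run++] = name;
}

/* Stand-ins for a file system init call and a built-in KVM init. */
static void fake_fs_init(void)  { log_run("fs"); }
static void fake_kvm_init(void) { log_run("kvm"); }

/* Run every registered call, lowest level first. */
static void run_all_initcalls(void)
{
    for (int level = 0; level < LEVEL_COUNT; level++)
        for (size_t i = 0; i < nr_registered; i++)
            if ((int)registered[i].level == level)
                registered[i].fn();
}

/* Position of a name in the run log, or -1 if it never ran. */
static int run_position(const char *name)
{
    for (size_t i = 0; i < nr_run; i++)
        if (strcmp(run_log[i], name) == 0)
            return (int)i;
    return -1;
}
```

Registering `SKETCH_MODULE_INIT(fake_kvm_init)` before `SKETCH_FS_INITCALL(fake_fs_init)` and then calling `run_all_initcalls()` still runs the fs stand-in first, because ordering comes from the level and not from registration order.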
[1] https://elixir.bootlin.com/linux/latest/source/arch/x86/kvm/svm/svm.c#L4673
[2] https://elixir.bootlin.com/linux/latest/source/include/linux/module.h#L88
[3] https://elixir.bootlin.com/linux/latest/source/include/linux/init.h#L296