Re: [Skiboot] [PATCH v7 0/4] Support for Self Save API in OPAL

From: Nicholas Piggin
Date: Thu Apr 16 2020 - 05:21:00 EST


Excerpts from Pratik Rajesh Sampat's message of April 16, 2020 5:53 pm:
> v6: https://lists.ozlabs.org/pipermail/skiboot/2020-March/016645.html
> Changelog
> v6 --> v7
> 1. Addressed comments from Gautham for reporting warnings and errors
>
> Background
> ==========
>
> The power management framework on POWER systems includes core idle
> states that lose context. Deep idle states, namely "winkle" on POWER8
> and "stop4" and "stop5" on POWER9, can be entered by a CPU to save
> different levels of power, as a consequence of which all the
> hypervisor resources such as SPRs and SCOMs are lost.
>
> For most SPRs and SCOMs, saving and restoring the content is handled
> by the hypervisor kernel prior to entering and after exiting an idle
> state, respectively. However, there is a small set of
> critical SPRs and XSCOMs that are expected to contain sane values even
> before the control is transferred to the hypervisor kernel at system
> reset vector.
>
> For this purpose, microcode firmware provides a mechanism to restore
> values on certain SPRs. The communication mechanism between the
> hypervisor kernel and the microcode is a standard interface called
> sleep-winkle-engine (SLW) on Power8 and Stop-API on Power9 which is
> abstracted by OPAL calls from the hypervisor kernel. The Stop-API
> provides an interface known as the self-restore API, to which the SPR
> number and a predefined value to be restored on wake-up from a deep
> stop state are supplied.
>
>
> Motivation to introduce a new Stop-API
> ======================================
>
> The self-restore API expects not just the SPR number but also the
> value with which the SPR is restored. This is good for those SPRs such
> as HSPRG0 whose values do not change at runtime, since for them, the
> kernel can invoke the self-restore API at boot time once the values of
> these SPRs are determined.
>
> However, there are use-cases wherein the value to be saved cannot be
> known or cannot be updated in the layer it currently is.
> The shortcomings and the new use-cases which cannot be served by the
> existing self-restore API serve as motivation for a new API:

Thanks for writing this up, it goes some way to help think about the
feature.

> Shortcoming1:
> ------------
> In a special wakeup scenario when a CPU is woken up in stop4/5 and
> after the task is done, the HCODE puts it back to stop. The value of
> PSSCR is passed to the HCODE via the self-restore API. The kernel
> currently provides the value of the deepest stop state due to being
> conservative. Thus if a core that was in stop4 was woken up due to
> special wakeup, the HCODE will now put it back to stop5 thus increasing
> the subsequent wakeup latency to ~200us.
> A mechanism is needed in place to update the PSSCR value each time the
> core is woken up due to special wakeup.

This seems like a shortcoming of the wakeup firmware that shouldn't need
any APIs to the kernel to solve, but the whole deep sleep wakeup seems
like a shortcoming, so let's assume they won't do that for whatever
reason. Then how much of a problem is this really? Are special wakeups
that frequent?

> Shortcoming2:
> ------------
> The value of LPCR is dynamic, depending on whether the CPU entered a
> stop state during CPU idle or CPU hotplug.
> Today, an additional self-restore call is made before entering
> CPU-Hotplug to clear the PECE1 bit in stop-API so that if we are
> woken up by a special wakeup on an offlined CPU, we go back to stop
> with the bit cleared.
> There is an overhead of an extra call.

This is a self-restore call when we offline or online a CPU? That's not
a real problem either, is it?

> New Use-case:
> -------------
> In the case where the hypervisor is running in an
> ultravisor environment, the boot time is too late in the cycle to make
> the self-restore API calls, as these cannot be invoked from a
> non-secure context anymore.
>
> To address these shortcomings, the firmware provides another API known
> as the self-save API. The self-save API only takes the SPR number as a
> parameter and will ensure that on wakeup from a deep-stop state the
> SPR is restored with the value that it contained prior to entering the
> deep-stop.
>
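
For reference, the difference between the two interfaces described in
the cover letter can be sketched as a toy model in C. This is only an
illustration under stated assumptions: the struct, the function names,
the SPR indices and the register values are all hypothetical, and none
of this is the actual Stop-API, HCODE or OPAL code. It models
self-restore as "caller supplies SPR and value" and self-save as
"caller supplies only the SPR, firmware snapshots it on stop entry".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical SPR indices for this model only (not real SPR numbers). */
enum { SPR_HSPRG0, SPR_LPCR, SPR_PSSCR, SPR_COUNT };

struct core_model {
    uint64_t spr[SPR_COUNT];         /* live state, lost in deep stop */
    bool     fixed_valid[SPR_COUNT]; /* self-restore entry present */
    uint64_t fixed_value[SPR_COUNT]; /* value given via self-restore */
    bool     self_save[SPR_COUNT];   /* SPR registered for self-save */
    uint64_t saved_value[SPR_COUNT]; /* snapshot taken on stop entry */
};

/* Self-restore model: the caller passes the SPR and a fixed value. */
static void model_self_restore(struct core_model *c, int spr, uint64_t value)
{
    assert(spr >= 0 && spr < SPR_COUNT);
    c->fixed_valid[spr] = true;
    c->fixed_value[spr] = value;
}

/* Self-save model: the caller passes only the SPR. */
static void model_self_save(struct core_model *c, int spr)
{
    assert(spr >= 0 && spr < SPR_COUNT);
    c->self_save[spr] = true;
}

/* Enter a deep stop state (all SPR context is lost) and wake up again. */
static void model_deep_stop_and_wake(struct core_model *c)
{
    for (int i = 0; i < SPR_COUNT; i++) {
        if (c->self_save[i])
            c->saved_value[i] = c->spr[i]; /* snapshot at stop entry */
    }
    for (int i = 0; i < SPR_COUNT; i++)
        c->spr[i] = 0;                     /* context lost */
    for (int i = 0; i < SPR_COUNT; i++) {
        if (c->self_save[i])
            c->spr[i] = c->saved_value[i]; /* value held before stop */
        else if (c->fixed_valid[i])
            c->spr[i] = c->fixed_value[i]; /* value given at registration */
    }
}

/* LPCR is registered once, then changed at runtime before deep stop. */
static uint64_t demo_lpcr_after_wakeup(bool use_self_save)
{
    struct core_model c = {0};
    const uint64_t boot_lpcr = 0x1000;    /* arbitrary illustrative value */
    const uint64_t hotplug_lpcr = 0x0800; /* arbitrary illustrative value */

    c.spr[SPR_LPCR] = boot_lpcr;
    if (use_self_save)
        model_self_save(&c, SPR_LPCR);
    else
        model_self_restore(&c, SPR_LPCR, boot_lpcr);

    c.spr[SPR_LPCR] = hotplug_lpcr;       /* runtime update, no new call */
    model_deep_stop_and_wake(&c);
    return c.spr[SPR_LPCR];
}
```

In this model, demo_lpcr_after_wakeup(false) comes back with the stale
registration-time value unless an extra self-restore call is made,
while demo_lpcr_after_wakeup(true) comes back with the value the
register held just before stop entry.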

If the ultravisor is deployed in production only systems where we don't
use runtime deep-stop states, do we need to handle this case?

Thanks,
Nick