Re: [PATCH v2 2/4] pmdomain: Respect the CPU system-wakeup QoS limit during s2idle
From: Ulf Hansson
Date: Thu Oct 30 2025 - 08:32:50 EST
On Thu, 30 Oct 2025 at 13:23, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>
> On Thu, Oct 30, 2025 at 1:00 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> >
> > On Thu, 30 Oct 2025 at 11:45, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Oct 16, 2025 at 5:19 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> > > >
> > > > A CPU system-wakeup QoS limit may have been requested by user-space. To
> > > > avoid breaking this constraint when entering a low-power state during
> > > > s2idle through genpd, let's extend the corresponding genpd governor for
> > > > CPUs. More precisely, during s2idle let the genpd governor select a
> > > > suitable low-power state, by taking into account the QoS limit.
> > > >
> > > > Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> > > > ---
> > > >
> > > > Changes in v2:
> > > > - Limit the change to the genpd governor for CPUs.
> > > >
> > > > ---
> > > > drivers/pmdomain/core.c | 10 ++++++++--
> > > > drivers/pmdomain/governor.c | 27 +++++++++++++++++++++++++++
> > > > include/linux/pm_domain.h | 1 +
> > > > 3 files changed, 36 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
> > > > index 61c2277c9ce3..4fd546ef0448 100644
> > > > --- a/drivers/pmdomain/core.c
> > > > +++ b/drivers/pmdomain/core.c
> > > > @@ -1425,8 +1425,14 @@ static void genpd_sync_power_off(struct generic_pm_domain *genpd, bool use_lock,
> > > > return;
> > > > }
> > > >
> > > > - /* Choose the deepest state when suspending */
> > > > - genpd->state_idx = genpd->state_count - 1;
> > > > + if (genpd->gov && genpd->gov->system_power_down_ok) {
> > > > + if (!genpd->gov->system_power_down_ok(&genpd->domain))
> > > > + return;
> > > > + } else {
> > > > + /* Default to the deepest state. */
> > > > + genpd->state_idx = genpd->state_count - 1;
> > > > + }
> > > > +
> > > > if (_genpd_power_off(genpd, false)) {
> > > > genpd->states[genpd->state_idx].rejected++;
> > > > return;
> > > > diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
> > > > index 39359811a930..bd1b9d66d4a5 100644
> > > > --- a/drivers/pmdomain/governor.c
> > > > +++ b/drivers/pmdomain/governor.c
> > > > @@ -415,9 +415,36 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
> > > > return false;
> > > > }
> > > >
> > > > +static bool cpu_system_power_down_ok(struct dev_pm_domain *pd)
> > > > +{
> > > > + s64 constraint_ns = cpu_wakeup_latency_qos_limit() * NSEC_PER_USEC;
> > >
> > > I'm not sure why genpd needs to take cpu_wakeup_latency_qos_limit()
> > > into account directly.
> > >
> > > It should be told by cpuidle which state has been selected on the CPU
> > > side and it should not go any deeper than that anyway.
> >
> > For PSCI OS-initiated mode, cpuidle doesn't know about the states that
> > may be shared among a group of CPUs.
> >
> > Instead, those states are controlled through the PM domain topology by
> > genpd and its governor, hence this is needed too.
>
> All right, but I'd like to understand how all of that works.
>
> So cpuidle selects a state to enter for the given CPU and then genpd
> is invoked. It has to take the exit latency of that state into
> account, so it doesn't go too deep. How does it do that?

Depending on the selected state, cpuidle-psci.c may end up calling
__psci_enter_domain_idle_state() (only for the deepest CPU state).

For s2idle, this means we call dev_pm_genpd_suspend|resume() to manage
the reference counting of the PM domains via genpd. This in turn may
lead genpd_sync_power_off() to select a state by calling the new
governor function above.
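
To make that last step more concrete, here is a rough user-space model of
the selection logic. It is only a sketch and not the kernel code: the
model_* names and the simplified structs are hypothetical, the QoS limit
is passed in as a plain argument rather than read through
cpu_wakeup_latency_qos_limit(), and it assumes that a state's cost is the
sum of its power-off and power-on latencies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one genpd idle state. */
struct model_state {
	int64_t power_off_latency_ns;
	int64_t power_on_latency_ns;
};

/* Hypothetical, simplified stand-in for a PM domain and its governor hook. */
struct model_domain {
	const struct model_state *states; /* ordered shallowest to deepest */
	int state_count;
	int state_idx;                    /* state picked for power off */
	bool (*system_power_down_ok)(struct model_domain *d,
				     int64_t constraint_ns);
};

/*
 * Governor side: walk from the deepest state towards the shallowest and
 * pick the first one whose assumed total latency fits the constraint.
 * Returns false if no state fits, so the domain stays powered on.
 */
bool model_cpu_system_power_down_ok(struct model_domain *d,
				    int64_t constraint_ns)
{
	for (int i = d->state_count - 1; i >= 0; i--) {
		int64_t latency_ns = d->states[i].power_off_latency_ns +
				     d->states[i].power_on_latency_ns;

		if (latency_ns <= constraint_ns) {
			d->state_idx = i;
			return true;
		}
	}

	return false;
}

/*
 * Core side: mirrors the shape of the hunk in genpd_sync_power_off().
 * Ask the governor if it provides the callback, otherwise default to
 * the deepest state. Returns true if the domain would be powered off.
 */
bool model_sync_power_off(struct model_domain *d, int64_t constraint_ns)
{
	if (d->system_power_down_ok) {
		if (!d->system_power_down_ok(d, constraint_ns))
			return false;
	} else {
		/* Default to the deepest state. */
		d->state_idx = d->state_count - 1;
	}

	return true;
}
```

In this model, a domain without the callback always ends up in its
deepest state, while a domain with the callback either gets the deepest
state that fits the limit or is left powered on.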

Did that make sense?

Kind regards
Uffe