Re: [PATCH 14/18] drivers: firmware: psci: Manage runtime PM in the idle path for CPUs

From: Lina Iyer
Date: Thu Jul 18 2019 - 13:41:32 EST

On Thu, Jul 18 2019 at 10:55 -0600, Ulf Hansson wrote:
On Thu, 18 Jul 2019 at 15:31, Lorenzo Pieralisi
<lorenzo.pieralisi@xxxxxxx> wrote:

On Thu, Jul 18, 2019 at 12:35:07PM +0200, Ulf Hansson wrote:
> On Tue, 16 Jul 2019 at 17:53, Lorenzo Pieralisi
> <lorenzo.pieralisi@xxxxxxx> wrote:
> >
> > On Mon, May 13, 2019 at 09:22:56PM +0200, Ulf Hansson wrote:
> > > When the hierarchical CPU topology layout is used in DT, let's allow the
> > > CPU to be power managed through its PM domain, via deploying runtime PM
> > > support.
> > >
> > > To know for which idle states runtime PM reference counting is needed,
> > > let's store the index of deepest idle state for the CPU, in a per CPU
> > > variable. This allows psci_cpu_suspend_enter() to compare this index with
> > > the requested idle state index and then act accordingly.
> >
> > I do not see why a system with two CPU CPUidle states, say CPU retention
> > and CPU shutdown, should not be calling runtime PM on CPU retention
> > entry.
> If the CPU idle governor did select the CPU retention for the CPU, it
> was probably because the target residency for the CPU shutdown state
> could not be met.

The kernel does not know what those cpu states represent, so, this is an
assumption you are making and it must be made clear that this code works
as long as your assumption is valid.

If eg a "cluster" retention state has lower target_residency than
the deepest CPU idle state this assumption is wrong.

Good point, you are right. I try to find a place to document this assumption.

And CPUidle and genPD governor decisions are not synced anyway so,
again, this is an assumption, not a certainty.

> In this case, there is no point in allowing any other deeper idle
> states for cluster/package/system, since those have even greater
> residencies, hence calling runtime PM doesn't make sense.

On the systems you are testing on.

So what you are saying typically means, that if all CPUs in the same
cluster have entered the CPU retention state, on some system the
cluster may also put into a cluster retention state (assuming the
target residency is met)?

Do you know of any systems that has these characteristics?

Many QCOM SoCs can do that. But with the hardware improving, the
power-performance benefits skew the results in favor of powering off
the cluster than keeping the CPU and cluster in retention.

Kevin H and I thought of this problem earlier on. But that is a second
level problem to solve and definitely to be thought of after we have the
support for the deepest states in the kernel. We left that out for a
later date. The idea would have been to setup the allowable state(s) in
the DT for CPU and cluster state definitions and have the genpd take
that into consideration when deciding the idle state for the domain.