Re: [PATCH v3 0/4] Allow genpd providers to power off domains on sync state

From: Abel Vesa
Date: Thu Apr 06 2023 - 05:26:17 EST


On 23-04-05 16:11:18, Ulf Hansson wrote:
> Abel, Saravana,
>
> On Fri, 31 Mar 2023 at 06:59, Abel Vesa <abel.vesa@xxxxxxxxxx> wrote:
> >
> > On 23-03-30 12:50:44, Saravana Kannan wrote:
> > > On Thu, Mar 30, 2023 at 4:27 AM Abel Vesa <abel.vesa@xxxxxxxxxx> wrote:
> > > >
> > > > On 23-03-27 17:17:28, Saravana Kannan wrote:
> > > > > On Mon, Mar 27, 2023 at 12:38 PM Abel Vesa <abel.vesa@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > > There have already been a couple of attempts to make the genpd "disable
> > > > > > unused" late initcall skip the powering off of domains that might be
> > > > > > needed until later on (i.e. until some consumer probes). The conclusion
> > > > > > was that the provider could return -EBUSY from the power_off callback
> > > > > > until the provider's sync state has been reached. This patch series tries
> > > > > > to provide a proof-of-concept that is working on Qualcomm platforms.
> > > > >
> > > > > I'm giving my thoughts in the cover letter instead of spreading it
> > > > > around all the patches so that there's context between the comments.
> > > > >
> > > > > 1) Why can't all the logic in this patch series be implemented at the
> > > > > framework level? And then allow the drivers to opt into this behavior
> > > > > by setting the sync_state() callback.
> > > > >
> > > > > That way, you can land it only for QC drivers by setting up
> > > > > sync_state() callback only for QC drivers, but actually have the same
> > > > > code function correctly for non-QC drivers too. And then once we have
> > > > > this functionality working properly for QC drivers for one kernel
> > > > > version (or two), we'll just have the framework set the device's
> > > > > driver's sync_state() if it doesn't have one already.
> > > >
> > > > I think Ulf has already NACK'ed that approach here:
> > > > [1] https://lore.kernel.org/lkml/CAPDyKFon35wcQ+5kx3QZb-awN_S_q8y1Sir-G+GoxkCvpN=iiA@xxxxxxxxxxxxxx/
> > >
> > > I would have NACK'ed that too because that's an incomplete fix. As I
> > > said further below, the fix needs to be at the aggregation level where
> > > you aggregate all the current consumer requests. In there, you need to
> > > add in the "state at boot" input that gets cleared out after a
> > > sync_state() call is received for that power domain.
> > >
> >
> > So, just to make sure I understand your point. You would rather have the
> > genpd_power_off check if 'state at boot' is 'on' and return busy, and
> > then, via a generic genpd sync state, you would mark 'state at
> > boot' as 'off' and queue up a power off request for each PD from there.
> > And as for 'state at boot', it would check the enable bit through the
> > provider.
> >
> > Am I right so far?
>
> I am not sure I completely follow what you are suggesting here.

Please have a look at this:
https://git.kernel.org/pub/scm/linux/kernel/git/abelvesa/linux.git/commit/?h=qcom/genpd/ignore_unused_until_sync_state&id=4f9e6140dfe77884012383f8ba2140cadb62ca4a

Keep in mind that this is WIP for now. Once I have something, I'll post it
on the mailing list. Right now, there is a missing piece, which is mentioned
in that commit message.

>
> Although, let me point out that there is no requirement from the genpd
> API point of view, that the provider needs to be a driver. This means
> that the sync_state callback may not even be applicable for all genpd
> providers.

Yes, I'm considering that case too.

>
> In other words, it looks to me that we may need some new genpd helper
> functions, no matter what. More importantly, it looks like we need an
> opt-in behaviour, unless we can figure out a common way for genpd to
> understand whether the sync_state thing is going to be applicable or
> not. Maybe Saravana has some ideas around this?
>
> Note that, I don't object to extending genpd to be more clever and to
> share common code, of course. We could, for example, make
> genpd_power_off() bail out earlier, rather than calling the
> ->power_off() callback and waiting for it to return -EBUSY. Both of
> you have pointed this out to me, in some of the earlier
> replies/discussions too.

The above link basically does this. I hope this is what Saravana has in
mind as well.
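
To make the idea concrete, here is a rough userspace model of the early
bail-out with a 'state at boot' input. It is only a sketch under my own
assumptions, not the code from the branch above and not the genpd API:
struct pd_model and all pd_model_*() helpers are made-up names, and the
hw_off_calls counter merely stands in for the provider's ->power_off().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one power domain, not struct generic_pm_domain. */
struct pd_model {
	bool hw_on;			/* current hardware state */
	bool on_at_boot;		/* 'state at boot' vote, dropped at sync state */
	unsigned int users;		/* active consumer votes */
	unsigned int hw_off_calls;	/* times the provider callback stand-in ran */
};

/* Sample the boot state once, e.g. from the provider's enable bit. */
static void pd_model_init(struct pd_model *pd, bool enabled_in_hw)
{
	assert(pd);
	pd->hw_on = enabled_in_hw;
	pd->on_at_boot = enabled_in_hw;
	pd->users = 0;
	pd->hw_off_calls = 0;
}

/* Aggregate all votes first and only then touch the provider. */
static int pd_model_power_off(struct pd_model *pd)
{
	assert(pd);
	if (!pd->hw_on)
		return 0;
	if (pd->users > 0 || pd->on_at_boot)
		return -EBUSY;		/* bail out before any provider callback */
	pd->hw_off_calls++;		/* stands in for ->power_off() */
	pd->hw_on = false;
	return 0;
}

/* A consumer requests the domain. */
static void pd_model_get(struct pd_model *pd)
{
	assert(pd);
	pd->users++;
	pd->hw_on = true;
}

/* A consumer releases the domain, then power-off is attempted. */
static int pd_model_put(struct pd_model *pd)
{
	assert(pd);
	if (pd->users > 0)
		pd->users--;
	return pd_model_power_off(pd);
}

/* Sync state reached: drop the boot vote and retry the power-off. */
static int pd_model_sync_state(struct pd_model *pd)
{
	assert(pd);
	pd->on_at_boot = false;
	return pd_model_power_off(pd);
}
```

In this model a domain that was on at boot refuses to power off without
ever reaching the provider until pd_model_sync_state() runs, while a
domain that was off at boot can be turned on and off again freely
before that.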

>
> >
> > > > And suggested this new approach that this patch series proposes.
> > > > > (Unless I misunderstood his point)
> > > >
> > > > >
> > > > > 2) sync_state() is not just about power on/off. It's also about the
> > > > > power domain level. Can you handle that too please?
> > > >
> > > > Well, this patchset only tries to delay the disabling of unused power
> > > > domains until all consumers have had a chance to probe. So we use sync
> > > > state only to queue up a power-off request to make sure those unused
> > > > ones get disabled.
> > >
> > > Sure, but the design is completely unusable for a more complete
> > > sync_state() behavior. I'm okay if you want to improve the
> > > sync_state() behavior in layers, but don't do it in a way where the
> > > current design will definitely not work for what you want to add in
> > > the future.
> >
> > But you would still be OK with the qcom_cc sync state wrapper, I guess,
> > > right? Your concern is only about the sync state callback not being
> > > a generic genpd one, AFAIU.
> >
> > >
> > > > >
> > > > > 3) In your GDSC drivers, it's not clear to me if you are preventing
> > > > > power off until sync_state() only for GDSCs that were already on at
> > > > > boot. So if an off-at-boot GDSC gets turned on, and then you attempt
> > > > > to turn it off before all its consumers have probed, it'll fail to
> > > > > power it off even though that wasn't necessary?
> > > >
> > > > I think we can circumvent looking at a GDSC by knowing it there was ever
> > > > a power on request since boot. I'll try to come up with something in the
> > > > new version.
> > >
> > > Please no. There's nothing wrong with reading the GDSC values. Please
> > > read them and don't turn on GDSC's that weren't on at boot.
> >
> > Sorry for the typos above, I basically said that for this concern of
> > yours, we can add the 'state at boot' thing you mentioned above by
> > looking at the GDSC (as in reading reg).
> >
> > >
> > > Otherwise you are making it a hassle for the case where there is a
> > > consumer without a driver for a GDSC that was off at boot. You are now
> > > forcing the use of timeouts or writing to state_synced file. Those
> > > should be absolute last resorts, but you are making that a requirement
> > > with your current implementation. If you implement it correctly by
> > > reading the GDSC register, things will "just work". And it's not even
> > > hard to do.
> > >
> > > NACK'ed until this is handled correctly.
> > >
> > > >
> > > > >
> > > > > > 4) Returning -EBUSY when a power off is attempted seems to be
> > > > > quite wasteful. The framework will go through the whole sequence of
> > > > > trying to power down, send the notifications and then fail and then
> > > > > send the undo notifications. Combined with point (2) I think this can
> > > > > be handled better at the aggregation level in the framework to avoid
> > > > > even going that far into the power off sequence.
> > > >
> > > > Again, have a look at [1] (above).
> > >
> > > See my reply above. If you do it properly at the framework level, this
> > > can be done in a clean way and will work for all power domains.
> > >
> > > -Saravana
> > >
> > > >
> > > > Ulf, any thoughts on this 4th point?
>
> Please, see my reply above.
>
> [...]
>
> Kind regards
> Uffe