Re: [PATCH 1/3] PM: domains: Drop the performance state vote for a device at detach

From: Ulf Hansson
Date: Mon Sep 06 2021 - 13:34:52 EST


On Mon, 6 Sept 2021 at 16:11, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>
> 06.09.2021 13:24, Ulf Hansson wrote:
> > On Sun, 5 Sept 2021 at 10:26, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
> >>
> >> 03.09.2021 17:03, Ulf Hansson wrote:
> >>> On Fri, 3 Sept 2021 at 11:58, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
> >>>>
> >>>> 03.09.2021 11:22, Ulf Hansson wrote:
> >>>>> On Fri, 3 Sept 2021 at 08:01, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> 02.09.2021 13:16, Ulf Hansson wrote:
> >>>>>>> When a device is detached from its genpd, genpd loses track of the device,
> >>>>>>> including its performance state vote that may have been requested for it.
> >>>>>>>
> >>>>>>> Rather than relying on the consumer driver to drop the performance state
> >>>>>>> vote for its device, let's do it internally in genpd when the device is
> >>>>>>> getting detached. In this way, we make sure that the aggregation of the
> >>>>>>> votes in genpd becomes correct.
> >>>>>>
> >>>>>> This is a dangerous behaviour in a case where performance state
> >>>>>> represents voltage. If hardware is kept active on detachment, say it's
> >>>>>> always-on, then it may be a disaster to drop the voltage for the active
> >>>>>> hardware.
> >>>>>>
> >>>>>> It's safe to drop performance state only if you assume that there is a
> >>>>>> firmware behind kernel which has its own layer of performance management
> >>>>>> and it will prevent the disaster by saying 'nope, I'm not doing this'.
> >>>>>>
> >>>>>> The performance state should be persistent for a device and it should be
> >>>>>> controlled in conjunction with runtime PM. If platform wants to drop
> >>>>>> performance state to zero on detachment, then this behaviour should be
> >>>>>> specific to that platform.
> >>>>>
> >>>>> I understand your concern, but at this point, genpd can't help to fix this.
> >>>>>
> >>>>> Genpd has no information about the device, unless it's attached to it.
> >>>>> For now and for these always on HWs, we simply need to make sure the
> >>>>> device stays attached, in one way or the other.
> >>>>
> >>>> This indeed requires redesigning GENPD to make it more coupled with a
> >>>> device, but this is not a real problem for any of the current API users
> >>>> AFAIK. Ideally the state should be persistent to make API more universal.
> >>>
> >>> Right. In fact this has been discussed in the past. In principle, the
> >>> idea was to attach to genpd at device registration, rather than at
> >>> driver probe.
> >>>
> >>> However, this is not very easy to implement, and so far the churn
> >>> involved has not really seemed worth it.
> >>>
> >>>>
> >>>> Since for today we assume that device should be suspended at the time of
> >>>> the detachment (if the default OPP state isn't used), it may be better
> >>>> to add a noisy warning message if pstate!=0, keeping the state untouched
> >>>> if it's not zero.
> >>>
> >>> That would just be very silly in my opinion.
> >>>
> >>> When the device is detached (suspended or not), it may cause its PM
> >>> domain to be powered off - and there is really nothing we can do about
> >>> that from the genpd point of view.
> >>>
> >>> As stated, the only current short term solution is to avoid detaching
> >>> the device. Anything else would just be papering over the issue.
> >>
> >> What about re-evaluating the performance state of the domain after
> >> detachment instead of setting the state to zero?
> >
> > I am not suggesting to set the performance state of the genpd to zero,
> > but to drop a potential vote for a performance state for the *device*
> > that is about to be detached.
>
> By removing the vote of the *device*, you will drop the performance
> state of the genpd. If device is active and it's wrong to drop its
> state, then you may cause the damage.
>
> > Calling genpd_set_performance_state(dev, 0), during detach will have
> > the same effect as triggering a re-evaluation of the performance state
> > for the genpd, but after the detach.
>
> Yes
>
> >> This way PD driver may
> >> take an action on detachment if performance isn't zero, before hardware
> >> is crashed, for example it may emit a warning.
> >
> > Not sure I got that. Exactly when do you want to emit a warning and
> > for what reason?
> >
> > Do you want to add a check somewhere to see if
> > 'gpd_data->performance_state' is non zero - and then print a warning?
>
> I want to check the 'gpd_data->performance_state' from the detachment
> callback and emit the warning + lock further performance changes in the
> PD driver since it's an error condition.

Alright, so if I understand correctly, you intend to do the check for
the "error condition" of the device in the genpd->detach_dev()
callback?

What exactly do you intend to do beyond this point, if you detect the
"error condition"? Locking further changes of the performance state
seems fragile too, especially if some other device/driver requires the
performance state to be raised. It sounds like you simply need to call
BUG_ON() then?

Also note that a very similar problem exists, *before* the device gets
attached in the first place. More precisely, nothing prevents the
performance state from being set to a non-compatible value for an
always-on HW/device that hasn't been attached yet. So maybe you need
to set the maximum performance state at genpd initialization, then
use the ->sync_state() callback to verify that all consumers have been
attached to the genpd provider, before allowing the state to be
changed/lowered?
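
To make the idea a bit more concrete, here is a minimal sketch of it as a
standalone userspace C model. This is not genpd code: every type and
function name below (model_domain, model_sync_state(), and so on) is
hypothetical, and model_sync_state() only stands in for whatever a real
->sync_state() callback would do. It assumes the provider knows up front
how many consumers to expect.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_MAX_CONSUMERS 4

struct model_domain {
	unsigned int max_state;                  /* highest supported state */
	unsigned int cur_state;                  /* state currently applied */
	unsigned int votes[MODEL_MAX_CONSUMERS]; /* per-consumer votes */
	bool attached[MODEL_MAX_CONSUMERS];
	size_t expected;                         /* consumers expected to attach */
	bool synced;                             /* all consumers have attached */
};

/* Aggregate the votes of attached consumers: the highest vote wins. */
unsigned int model_aggregate(const struct model_domain *d)
{
	unsigned int state = 0;

	for (size_t i = 0; i < d->expected; i++)
		if (d->attached[i] && d->votes[i] > state)
			state = d->votes[i];
	return state;
}

/* Until the domain is synced, hold the maximum state set at init. */
void model_update(struct model_domain *d)
{
	d->cur_state = d->synced ? model_aggregate(d) : d->max_state;
}

/* Start at the maximum state so unattached always-on HW stays safe. */
void model_init(struct model_domain *d, unsigned int max_state,
		size_t expected)
{
	if (expected > MODEL_MAX_CONSUMERS)
		expected = MODEL_MAX_CONSUMERS;
	*d = (struct model_domain){
		.max_state = max_state,
		.cur_state = max_state,
		.expected = expected,
	};
}

/* Attach a consumer and record its vote. */
void model_attach(struct model_domain *d, size_t idx, unsigned int vote)
{
	if (idx >= d->expected)
		return;
	d->attached[idx] = true;
	d->votes[idx] = vote;
	model_update(d);
}

/* Detach a consumer, dropping its vote as part of the detach. */
void model_detach(struct model_domain *d, size_t idx)
{
	if (idx >= d->expected)
		return;
	d->attached[idx] = false;
	d->votes[idx] = 0;
	model_update(d);
}

/*
 * Stand-in for a sync_state-style check: only once every expected
 * consumer has attached is the state allowed to follow the votes.
 */
bool model_sync_state(struct model_domain *d)
{
	for (size_t i = 0; i < d->expected; i++)
		if (!d->attached[i])
			return false;
	d->synced = true;
	model_update(d);
	return true;
}
```

In this model the state stays at the maximum while any expected consumer
is still missing, and it only starts to follow the aggregated votes once
model_sync_state() succeeds. Note that it does nothing for a device that
detaches while its HW is still active; that part of the problem remains.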

Kind regards
Uffe