Re: [PATCH 1/3] PM: domains: Drop the performance state vote for a device at detach

From: Dmitry Osipenko
Date: Mon Sep 06 2021 - 15:33:34 EST


06.09.2021 20:34, Ulf Hansson wrote:
> On Mon, 6 Sept 2021 at 16:11, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>>
>> 06.09.2021 13:24, Ulf Hansson wrote:
>>> On Sun, 5 Sept 2021 at 10:26, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>>>>
>>>> 03.09.2021 17:03, Ulf Hansson wrote:
>>>>> On Fri, 3 Sept 2021 at 11:58, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>>>>>>
>>>>>> 03.09.2021 11:22, Ulf Hansson wrote:
>>>>>>> On Fri, 3 Sept 2021 at 08:01, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> 02.09.2021 13:16, Ulf Hansson wrote:
>>>>>>>>> When a device is detached from its genpd, genpd loses track of the device,
>>>>>>>>> including its performance state vote that may have been requested for it.
>>>>>>>>>
>>>>>>>>> Rather than relying on the consumer driver to drop the performance state
>>>>>>>>> vote for its device, let's do it internally in genpd when the device is
>>>>>>>>> getting detached. In this way, we make sure that the aggregation of the
>>>>>>>>> votes in genpd becomes correct.
>>>>>>>>
>>>>>>>> This is a dangerous behaviour in a case where performance state
>>>>>>>> represents voltage. If hardware is kept active on detachment, say it's
>>>>>>>> always-on, then it may be a disaster to drop the voltage for the active
>>>>>>>> hardware.
>>>>>>>>
>>>>>>>> It's safe to drop performance state only if you assume that there is a
>>>>>>>> firmware behind kernel which has its own layer of performance management
>>>>>>>> and it will prevent the disaster by saying 'nope, I'm not doing this'.
>>>>>>>>
>>>>>>>> The performance state should be persistent for a device and it should be
>>>>>>>> controlled in conjunction with runtime PM. If platform wants to drop
>>>>>>>> performance state to zero on detachment, then this behaviour should be
>>>>>>>> specific to that platform.
>>>>>>>
>>>>>>> I understand your concern, but at this point, genpd can't help to fix this.
>>>>>>>
>>>>>>> Genpd has no information about the device, unless it's attached to it.
>>>>>>> For now and for these always on HWs, we simply need to make sure the
>>>>>>> device stays attached, in one way or the other.
>>>>>>
>>>>>> This indeed requires redesigning GENPD to make it more coupled with a
>>>>>> device, but this is not a real problem for any of the current API users
>>>>>> AFAIK. Ideally the state should be persistent to make API more universal.
>>>>>
>>>>> Right. In fact this has been discussed in the past. In principle, the
>>>>> idea was to attach to genpd at device registration, rather than at
>>>>> driver probe.
>>>>>
>>>>> Although, this is not very easy to implement - and it seems like the
>>>>> churn has not really been worth it. At least so far.
>>>>>
>>>>>>
>>>>>> Since for today we assume that device should be suspended at the time of
>>>>>> the detachment (if the default OPP state isn't used), it may be better
>>>>>> to add a noisy warning message if pstate!=0, keeping the state untouched
>>>>>> if it's not zero.
>>>>>
>>>>> That would just be very silly in my opinion.
>>>>>
>>>>> When the device is detached (suspended or not), it may cause its PM
>>>>> domain to be powered off - and there is really nothing we can do about
>>>>> that from the genpd point of view.
>>>>>
>>>>> As stated, the only current short term solution is to avoid detaching
>>>>> the device. Anything else would just be papering over the issue.
>>>>
>>>> What about re-evaluating the performance state of the domain after
>>>> detachment instead of setting the state to zero?
>>>
>>> I am not suggesting to set the performance state of the genpd to zero,
>>> but to drop a potential vote for a performance state for the *device*
>>> that is about to be detached.
>>
>> By removing the vote of the *device*, you will drop the performance
>> state of the genpd. If device is active and it's wrong to drop its
>> state, then you may cause the damage.
>>
>>> Calling genpd_set_performance_state(dev, 0), during detach will have
>>> the same effect as triggering a re-evaluation of the performance state
>>> for the genpd, but after the detach.
>>
>> Yes
>>
>>>> This way PD driver may
>>>> take an action on detachment if performance isn't zero, before hardware
>>>> is crashed, for example it may emit a warning.
>>>
>>> Not sure I got that. Exactly when do you want to emit a warning and
>>> for what reason?
>>>
>>> Do you want to add a check somewhere to see if
>>> 'gpd_data->performance_state' is non zero - and then print a warning?
>>
>> I want to check the 'gpd_data->performance_state' from the detachment
>> callback and emit the warning + lock further performance changes in the
>> PD driver since it's an error condition.
>
> Alright, so if I understand correctly, you intend to do the check for
> the "error condition" of the device in the genpd->detach_dev()
> callback?

Yes

> What exactly do you intend to do beyond this point, if you detect the
> "error condition"? Locking further changes of the performance state
> seems fragile too, especially if some other device/driver requires the
> performance state to be raised. It sounds like you simply need to call
> BUG_ON() then?

I can lock it to a high performance state.
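
For illustration only, the behaviour discussed above can be sketched as a
small userspace toy model. This is not kernel code: `toy_domain`,
`toy_aggregate()`, `toy_attach()`, `toy_detach()` and `PSTATE_HIGH` are
made-up names, and the only assumptions taken from the thread are that the
domain state follows the aggregated per-device votes and that a detach
callback may check the leftover vote.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_DEVS    4
#define PSTATE_HIGH 3 /* hypothetical "high" performance state */

/* Toy model of a power domain; NOT the kernel's struct generic_pm_domain. */
struct toy_domain {
	unsigned int votes[MAX_DEVS]; /* per-device performance state votes */
	bool attached[MAX_DEVS];
	bool locked_high;             /* set once an unsafe detach was seen */
};

/* The domain state is the highest vote among the attached devices. */
static unsigned int toy_aggregate(const struct toy_domain *pd)
{
	unsigned int state = 0;

	if (pd->locked_high)
		return PSTATE_HIGH;

	for (int i = 0; i < MAX_DEVS; i++)
		if (pd->attached[i] && pd->votes[i] > state)
			state = pd->votes[i];

	return state;
}

static void toy_attach(struct toy_domain *pd, int dev, unsigned int vote)
{
	assert(dev >= 0 && dev < MAX_DEVS);
	pd->attached[dev] = true;
	pd->votes[dev] = vote;
}

/*
 * Models the check proposed for the detach callback: a non-zero vote at
 * detach time is treated as an error, so warn and pin the domain high
 * instead of letting the aggregated state drop.
 */
static void toy_detach(struct toy_domain *pd, int dev)
{
	assert(dev >= 0 && dev < MAX_DEVS);

	if (pd->votes[dev] != 0) {
		fprintf(stderr,
			"toy_pd: dev %d detached with pstate %u, locking high\n",
			dev, pd->votes[dev]);
		pd->locked_high = true;
	}

	pd->attached[dev] = false;
	pd->votes[dev] = 0;
}
```

In this model a device that drops its vote to zero before detaching simply
lowers the aggregate, while a device that detaches with a live vote leaves
the domain pinned at `PSTATE_HIGH`.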

> Also note that a very similar problem exists, *before* the device gets
> attached in the first place. More precisely, nothing prevents the
> performance state from being set to a non-compatible value for an
> always-on HW/device that hasn't been attached yet. So maybe you need
> to set the maximum performance state at genpd initialization, then
> use the ->sync_state() callback to verify that all consumers have been
> attached to the genpd provider, before allowing the state to be
> changed/lowered?

That is already done by the PD driver.

https://elixir.bootlin.com/linux/latest/source/drivers/soc/tegra/pmc.c#L3790
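
As a rough sketch of the general idea quoted above (hold the maximum state
until all consumers are known, then apply what was requested), here is a
userspace toy model. It is not taken from pmc.c and does not claim to
mirror it; `toy_provider`, `toy_set_state()`, `toy_sync_state()` and
`TOY_PSTATE_MAX` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PSTATE_MAX 3 /* hypothetical maximum performance state */

/* Toy model of a PD provider; NOT a real kernel structure. */
struct toy_provider {
	bool synced;            /* true once all consumers have attached */
	unsigned int requested; /* latest aggregated request */
	unsigned int applied;   /* state actually programmed */
};

/* Start at the maximum state so unattached always-on hardware stays safe. */
static void toy_provider_init(struct toy_provider *p)
{
	p->synced = false;
	p->requested = 0;
	p->applied = TOY_PSTATE_MAX;
}

/* Before sync, only record the request; after sync, apply it directly. */
static void toy_set_state(struct toy_provider *p, unsigned int state)
{
	assert(state <= TOY_PSTATE_MAX);
	p->requested = state;
	if (p->synced)
		p->applied = state;
}

/* Stand-in for a sync_state-style notification: allow lowering from now on. */
static void toy_sync_state(struct toy_provider *p)
{
	p->synced = true;
	p->applied = p->requested;
}
```

Requests made before `toy_sync_state()` never lower `applied` below
`TOY_PSTATE_MAX`; the recorded request takes effect only once the sync
point is reached.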