Re: [PATCH] pmdomain: mediatek: fix race condition in power on/power off sequences
From: Eugen Hristev
Date: Wed Nov 29 2023 - 09:48:37 EST
On 11/29/23 16:41, AngeloGioacchino Del Regno wrote:
> Il 29/11/23 14:48, Eugen Hristev ha scritto:
>> On 11/29/23 15:37, AngeloGioacchino Del Regno wrote:
>>> Il 29/11/23 14:28, Eugen Hristev ha scritto:
>>>> On 11/29/23 14:52, AngeloGioacchino Del Regno wrote:
>>>>> Il 29/11/23 12:31, Eugen Hristev ha scritto:
>>>>>> It can happen that during the power off sequence for a power domain
>>>>>> another power on sequence is started, which can lead to the same power
>>>>>> domain being powered on and off at the same time.
>>>>>> This can happen if parallel probing occurs: one device starts probing and
>>>>>> one power domain is probe deferred, which leads to all power domains being
>>>>>> rolled back and powered off, while at the same time another device starts
>>>>>> probing and requests powering on the same power domains or similar ones.
>>>>>>
>>>>>> This was encountered on MT8186, when the sequence is:
>>>>>> Power on SSUSB
>>>>>> Power on SSUSB_P1
>>>>>> Power on DIS
>>>>>> -> probe deferred
>>>>>> Power off DIS
>>>>>> Power off SSUSB_P1
>>>>>> Power off SSUSB
>>>>>>
>>>>>> During the power off sequence of SSUSB, a new, similar sequence starts,
>>>>>> and during its power on of SSUSB, clocks are enabled.
>>>>>> In this case, powering off SSUSB in the first sequence fails, because the
>>>>>> power off ACK bit check times out (as clocks are turned back on by the second
>>>>>> sequence). As a consequence, powering it on also times out, which leaves
>>>>>> the whole power domain in a bad state.
>>>>>>
>>>>>> To solve this issue, add a mutex that locks the whole power off/power on
>>>>>> sequence so that multiple sequences can never try to enable or disable
>>>>>> the same power domain in parallel.
>>>>>>
>>>>>> Fixes: 59b644b01cf4 ("soc: mediatek: Add MediaTek SCPSYS power domains")
>>>>>> Signed-off-by: Eugen Hristev <eugen.hristev@xxxxxxxxxxxxx>
>>>>>
>>>>> I don't think that it's a race between genpd_power_on() and genpd_power_off()
>>>>> calls at all, because genpd *does* have locking after all... at least for probe
>>>>> and for parents of a power domain (and more anyway).
>>>>>
>>>>> As far as I remember, what happens when you start .probe()'ing a device is:
>>>>> platform_probe() -> dev_pm_domain_attach() -> genpd_dev_pm_attach()
>>>>>
>>>>> There, you end up with
>>>>>
>>>>> 	if (power_on) {
>>>>> 		genpd_lock(pd);
>>>>> 		ret = genpd_power_on(pd, 0);
>>>>> 		genpd_unlock(pd);
>>>>> 	}
>>>>>
>>>>> ...but when you fail probing, you go with genpd_dev_pm_detach(), which then calls
>>>>>
>>>>> /* Check if PM domain can be powered off after removing this device. */
>>>>> genpd_queue_power_off_work(pd);
>>>>>
>>>>> but even then, you end up being in a worker doing
>>>>>
>>>>> genpd_lock(genpd);
>>>>> genpd_power_off(genpd, false, 0);
>>>>> genpd_unlock(genpd);
>>>>>
>>>>> ...so I don't understand why this mutex can resolve the situation here (also: are
>>>>> you really sure that the race is solved like that?)
>>>>>
>>>>> I'd say that this probably needs more justification and a trace of the actual
>>>>> situation here.
>>>>>
>>>>> Besides, if this really resolves the issue, I would prefer seeing variants of
>>>>> scpsys_power_{on,off}() functions, because we anyway don't need to lock mutexes
>>>>> during this driver's probe (add_subdomain calls scpsys_power_on()).
>>>>> In that case, `scpsys_power_on_unlocked()` would be an idea... but still, please
>>>>> analyze why your solution works, if it does, because I'm not convinced.
>>>>
>>>> What I see in my tests is that a power on call for the SSUSB domain happens while
>>>> the previous power off sequence has not yet completed, most likely while it is
>>>> waiting in readx_poll_timeout(). This leaves the power domain in an inconsistent
>>>> state, and the ACKs are not received the next time a power on attempt occurs.
>>>>
>>>> I understand what you say about locks, but in this case the powering off is not
>>>> called by genpd itself, but rather by the rollback mechanism for a failed
>>>> probe: when the probing fails, scpsys_domain_cleanup() is called during the
>>>> same probing session.
>>>> Then it happens that probing begins again while the previous cleanup has not yet
>>>> completed. I am not sure whether the lock is still held from the previous run,
>>>> but it's clearly not waiting for a lock to be released before being called again.
>>>>
>>>
>>> Sorry but I'm a bit lost now: is the problem about probe deferrals of the USB
>>> driver, or about probe deferrals of the mtk-pm-domains driver?
>>>
>>> scpsys_domain_cleanup() is only called upon scpsys_probe() failure.
>>
>> You are right, my explanation was bad.
>>
>> It happens during the mtk-pm-domains driver probe.
>>
>> Not all domains can power up, so everything is rolled back, and this happens
>> multiple times.
>> On rare occasions, another probing sequence starts while the previous one
>> has not finished.
>> I mentioned devices because I had in mind the fact that each device requires a
>> power domain, and parallel probing of these devices causes the mtk-pm-domains
>> driver probe to be called from two different places.
>>
>> e.g. device 1 probes -> call mtk-pm-domains probe because it requires X power domain
>>
>> device 2 probes -> call mtk-pm-domains probe because it requires Y power domain.
>>
>> The first attempt fails but has not completed when the second attempt starts.
>>
>> Maybe this is a better explanation of the situation?
>
> Yeah, now it's a bit clearer!
>
> At this point, I think that you can get away with locking just one path (or two):
>
> /* This is the one I suspect the most */
> static void scpsys_remove_one_domain(struct scpsys_domain *pd)
> {
> 	int ret;
>
> 	*** lock ***
> 	if (scpsys_domain_is_on(pd))
> 		scpsys_power_off(&pd->genpd);
> 	*** unlock ***
>
> 	.....
> }
>
> /* This one as well eventually */
> static struct
> generic_pm_domain *scpsys_add_one_domain(struct scpsys *scpsys, struct device_node *node)
> {
> 	...............
>
> 	if (MTK_SCPD_CAPS(pd, MTK_SCPD_KEEP_DEFAULT_OFF)) {
> 		if (scpsys_domain_is_on(pd)) /* Maybe LOCK this one too? */
> 			dev_warn(scpsys->dev,
> 				 "%pOF: A default off power domain has been ON\n", node);
> 	} else {
> 		*** lock ***
> 		ret = scpsys_power_on(&pd->genpd);
> 		*** unlock ***
> 		if (ret < 0) {
> 			dev_err(scpsys->dev, "%pOF: failed to power on domain: %d\n", node, ret);
> 			goto err_put_subsys_clocks;
> 		}
>
> 		if (MTK_SCPD_CAPS(pd, MTK_SCPD_ALWAYS_ON))
> 			pd->genpd.flags |= GENPD_FLAG_ALWAYS_ON;
> 	}
>
> 	..........
> }
>
> Can you please try locking only the remove_one_domain() poweroff call before
> trying both?
>
> The reason is that in the add_one_domain() case, we haven't registered the power
> domain yet, so locking may not be required there to make things work right.
All accesses to the clocks/registers need to be inside a critical section, so both
paths (power on and power off) must be serialized. A mutex must therefore be taken
before entering either of them; otherwise the locking makes no sense. There must be
no path left that reaches power on/power off without holding the mutex.
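
To illustrate what I mean, here is a rough userspace sketch of that rule. It is
not the driver code: a pthread mutex stands in for the kernel's struct mutex, and
every model_* name and the in_flight counters are made up for this example. The
helpers with the _locked suffix assume the caller already holds the lock, and the
only entry points are the wrappers that take it, so no path reaches the sequence
without the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct scpsys: one lock shared by every domain. */
struct model_scpsys {
	pthread_mutex_t lock;
	int in_flight;		/* sequences currently running */
	int max_in_flight;	/* highest value in_flight ever reached */
};

/* Hypothetical stand-in for struct scpsys_domain. */
struct model_domain {
	struct model_scpsys *scpsys;
	bool clocks_on;
	bool powered;
};

static void model_scpsys_init(struct model_scpsys *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->in_flight = 0;
	s->max_in_flight = 0;
}

/* Bookkeeping only: record how many sequences overlap. */
static void model_sequence_begin(struct model_scpsys *s)
{
	s->in_flight++;
	if (s->in_flight > s->max_in_flight)
		s->max_in_flight = s->in_flight;
}

static void model_sequence_end(struct model_scpsys *s)
{
	s->in_flight--;
}

/* Caller must hold pd->scpsys->lock. */
static int model_power_on_locked(struct model_domain *pd)
{
	model_sequence_begin(pd->scpsys);
	pd->clocks_on = true;	/* models enabling the domain clocks */
	pd->powered = true;	/* models the power on ACK arriving */
	model_sequence_end(pd->scpsys);
	return 0;
}

/* Caller must hold pd->scpsys->lock. */
static int model_power_off_locked(struct model_domain *pd)
{
	model_sequence_begin(pd->scpsys);
	pd->powered = false;	/* models the power off ACK arriving */
	pd->clocks_on = false;	/* models disabling the domain clocks */
	model_sequence_end(pd->scpsys);
	return 0;
}

/* Entry points: the whole sequence runs inside the critical section. */
static int model_power_on(struct model_domain *pd)
{
	int ret;

	pthread_mutex_lock(&pd->scpsys->lock);
	ret = model_power_on_locked(pd);
	pthread_mutex_unlock(&pd->scpsys->lock);
	return ret;
}

static int model_power_off(struct model_domain *pd)
{
	int ret;

	pthread_mutex_lock(&pd->scpsys->lock);
	ret = model_power_off_locked(pd);
	pthread_mutex_unlock(&pd->scpsys->lock);
	return ret;
}
```

Because in_flight is only touched while the lock is held, max_in_flight can never
exceed 1 in this model, whichever callers race on model_power_on() and
model_power_off(). If one of the two paths skipped the lock, that guarantee would
be gone, which is why locking only the power off side is not enough.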
>
> Cheers!
>
>>>
>>>>>
>>>>>> ---
>>>>>> drivers/pmdomain/mediatek/mtk-pm-domains.c | 24 +++++++++++++++++-----
>>>>>> 1 file changed, 19 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/pmdomain/mediatek/mtk-pm-domains.c b/drivers/pmdomain/mediatek/mtk-pm-domains.c
>>>>>> index d5f0ee05c794..4f136b47e539 100644
>>>>>> --- a/drivers/pmdomain/mediatek/mtk-pm-domains.c
>>>>>> +++ b/drivers/pmdomain/mediatek/mtk-pm-domains.c
>>>>>> @@ -9,6 +9,7 @@
>>>>>> #include <linux/io.h>
>>>>>> #include <linux/iopoll.h>
>>>>>> #include <linux/mfd/syscon.h>
>>>>>> +#include <linux/mutex.h>
>>>>>> #include <linux/of.h>
>>>>>> #include <linux/of_clk.h>
>>>>>> #include <linux/platform_device.h>
>>>>>> @@ -56,6 +57,7 @@ struct scpsys {
>>>>>> struct device *dev;
>>>>>> struct regmap *base;
>>>>>> const struct scpsys_soc_data *soc_data;
>>>>>> + struct mutex mutex;
>>>>>> struct genpd_onecell_data pd_data;
>>>>>> struct generic_pm_domain *domains[];
>>>>>> };
>>>>>> @@ -238,9 +240,13 @@ static int scpsys_power_on(struct generic_pm_domain *genpd)
>>>>>> bool tmp;
>>>>>> int ret;
>>>>>> + mutex_lock(&scpsys->mutex);
>>>>>> +
>>>>>> ret = scpsys_regulator_enable(pd->supply);
>>>>>> - if (ret)
>>>>>> + if (ret) {
>>>>>> + mutex_unlock(&scpsys->mutex);
>>>>>> return ret;
>>>>>> + }
>>>>>> ret = clk_bulk_prepare_enable(pd->num_clks, pd->clks);
>>>>>> if (ret)
>>>>>> @@ -291,6 +297,7 @@ static int scpsys_power_on(struct generic_pm_domain *genpd)
>>>>>> goto err_enable_bus_protect;
>>>>>> }
>>>>>> + mutex_unlock(&scpsys->mutex);
>>>>>> return 0;
>>>>>> err_enable_bus_protect:
>>>>>> @@ -305,6 +312,7 @@ static int scpsys_power_on(struct generic_pm_domain *genpd)
>>>>>> clk_bulk_disable_unprepare(pd->num_clks, pd->clks);
>>>>>> err_reg:
>>>>>> scpsys_regulator_disable(pd->supply);
>>>>>> + mutex_unlock(&scpsys->mutex);
>>>>>> return ret;
>>>>>> }
>>>>>> @@ -315,13 +323,15 @@ static int scpsys_power_off(struct generic_pm_domain *genpd)
>>>>>> bool tmp;
>>>>>> int ret;
>>>>>> + mutex_lock(&scpsys->mutex);
>>>>>> +
>>>>>> ret = scpsys_bus_protect_enable(pd);
>>>>>> if (ret < 0)
>>>>>> - return ret;
>>>>>> + goto err_mutex_unlock;
>>>>>> ret = scpsys_sram_disable(pd);
>>>>>> if (ret < 0)
>>>>>> - return ret;
>>>>>> + goto err_mutex_unlock;
>>>>>> if (pd->data->ext_buck_iso_offs && MTK_SCPD_CAPS(pd, MTK_SCPD_EXT_BUCK_ISO))
>>>>>> regmap_set_bits(scpsys->base, pd->data->ext_buck_iso_offs,
>>>>>> @@ -340,13 +350,15 @@ static int scpsys_power_off(struct generic_pm_domain *genpd)
>>>>>> ret = readx_poll_timeout(scpsys_domain_is_on, pd, tmp, !tmp, MTK_POLL_DELAY_US,
>>>>>> MTK_POLL_TIMEOUT);
>>>>>> if (ret < 0)
>>>>>> - return ret;
>>>>>> + goto err_mutex_unlock;
>>>>>> clk_bulk_disable_unprepare(pd->num_clks, pd->clks);
>>>>>> scpsys_regulator_disable(pd->supply);
>>>>>> - return 0;
>>>>>> +err_mutex_unlock:
>>>>>> + mutex_unlock(&scpsys->mutex);
>>>>>> + return ret;
>>>>>> }
>>>>>> static struct
>>>>>> @@ -700,6 +712,8 @@ static int scpsys_probe(struct platform_device *pdev)
>>>>>> return PTR_ERR(scpsys->base);
>>>>>> }
>>>>>> + mutex_init(&scpsys->mutex);
>>>>>> +
>>>>>> ret = -ENODEV;
>>>>>> for_each_available_child_of_node(np, node) {
>>>>>> struct generic_pm_domain *domain;
>>>>>
>>>
>