Re: [PATCH] regulator: simplify locking

From: Dmitry Osipenko
Date: Mon Aug 10 2020 - 01:22:30 EST


10.08.2020 03:59, Michał Mirosław wrote:
> On Mon, Aug 10, 2020 at 03:21:47AM +0300, Dmitry Osipenko wrote:
>> 10.08.2020 01:30, Michał Mirosław wrote:
>>> On Mon, Aug 10, 2020 at 12:40:04AM +0300, Dmitry Osipenko wrote:
>>>> 10.08.2020 00:16, Michał Mirosław wrote:
>>>>> Simplify regulator locking by removing locking around locking. rdev->ref
>>>>> is now accessed only when the lock is taken. The code still smells fishy,
>>>>> but now it's obvious why.
>>>>>
>>>>> Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking")
>>>>> Signed-off-by: Michał Mirosław <mirq-linux@xxxxxxxxxxxx>
>>>>> ---
>>>>> drivers/regulator/core.c | 37 ++++++--------------------------
>>>>> include/linux/regulator/driver.h | 1 -
>>>>> 2 files changed, 6 insertions(+), 32 deletions(-)
>>>>>
>>>>> diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
>>>>> index 9e18997777d3..b0662927487c 100644
>>>>> --- a/drivers/regulator/core.c
>>>>> +++ b/drivers/regulator/core.c
>>>>> @@ -45,7 +45,6 @@
>>>>> pr_debug("%s: " fmt, rdev_get_name(rdev), ##__VA_ARGS__)
>>>>>
>>>>> static DEFINE_WW_CLASS(regulator_ww_class);
>>>>> -static DEFINE_MUTEX(regulator_nesting_mutex);
>>>>> static DEFINE_MUTEX(regulator_list_mutex);
>>>>> static LIST_HEAD(regulator_map_list);
>>>>> static LIST_HEAD(regulator_ena_gpio_list);
>>>>> @@ -150,32 +149,13 @@ static bool regulator_ops_is_valid(struct regulator_dev *rdev, int ops)
>>>>> static inline int regulator_lock_nested(struct regulator_dev *rdev,
>>>>> struct ww_acquire_ctx *ww_ctx)
>>>>> {
>>>>> - bool lock = false;
>>>>> int ret = 0;
>>>>>
>>>>> - mutex_lock(&regulator_nesting_mutex);
>>>>> + if (ww_ctx || !mutex_trylock_recursive(&rdev->mutex.base))
>>>>
>>>> Have you seen the comment on mutex_trylock_recursive()?
>>>>
>>>> https://elixir.bootlin.com/linux/v5.8/source/include/linux/mutex.h#L205
>>>>
>>>> * This function should not be used, _ever_. It is purely for hysterical GEM
>>>> * raisins, and once those are gone this will be removed.
>>>>
>>>> I knew about this function and I don't think it's okay to use it, which
>>>> is why there is the "nesting_mutex" and the "owner" check.
>>>>
>>>> If you disagree, then perhaps you should make another patch to remove
>>>> the stale comment on mutex_trylock_recursive().
>>>
>>> I think that reimplementing the function just to not use it is not the
>>> right solution. The whole locking protocol is problematic and this patch
>>> just uncovers one side of it.
>>
>> It's not clear to me what is uncovered; the ref_cnt was always accessed
>> under the lock. Could you please explain in more detail?
>>
>> It would be awesome if you could improve the code, but then you should
>> un-deprecate mutex_trylock_recursive() before making use of it. Maybe
>> nobody will mind and it will all be good in the end.
>
> I'm not sure why the framework wants recursive locking? If only for the
> coupling case, then ww_mutex seems the right direction to replace it:
> while walking the graph it will detect entering the same node
> a second time. But this works only during the locking transaction (with
> ww_context != NULL). Allowing recursive regulator_lock() outside of it
> seems to invite trouble.

Yes, it's for the coupling case. Coupled regulators may have common
ancestors, and then the whole sub-tree needs to be locked while operating
on a coupled regulator.

Nested locking is discouraged in general because it is a source of
bugs. I guess it should be possible to get rid of all nested locking
in the regulator core and use a pure ww_mutex, but somebody would need
to dedicate time to work on it.
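
To illustrate why the common-ancestor case ends up taking the same lock
twice, here is a rough userspace sketch of the "nesting_mutex" plus "owner"
idea. It is not the kernel code: pthread mutexes stand in for ww_mutex,
there is no ww_acquire_ctx or -EDEADLK handling, and struct node,
node_lock(), node_unlock(), lock_chain(), unlock_chain() and
demo_shared_supply_depth() are all hypothetical names made up for this
example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a regulator: a lock plus owner bookkeeping. */
struct node {
	pthread_mutex_t mutex;
	pthread_t owner;	/* valid only while 'owned' is true */
	bool owned;
	unsigned int ref_cnt;	/* nesting depth held by 'owner' */
	struct node *supply;	/* parent in the supply tree, or NULL */
};

/* Protects 'owner', 'owned' and 'ref_cnt' of every node. */
static pthread_mutex_t nesting_mutex = PTHREAD_MUTEX_INITIALIZER;

static void node_init(struct node *n, struct node *supply)
{
	pthread_mutex_init(&n->mutex, NULL);
	n->owned = false;
	n->ref_cnt = 0;
	n->supply = supply;
}

static void node_lock(struct node *n)
{
	bool need_lock = false;

	pthread_mutex_lock(&nesting_mutex);
	if (n->owned && pthread_equal(n->owner, pthread_self()))
		n->ref_cnt++;		/* recursive entry by the owner */
	else
		need_lock = true;
	pthread_mutex_unlock(&nesting_mutex);

	if (need_lock) {
		/* Never block on a node while holding nesting_mutex. */
		pthread_mutex_lock(&n->mutex);
		pthread_mutex_lock(&nesting_mutex);
		n->owner = pthread_self();
		n->owned = true;
		n->ref_cnt = 1;
		pthread_mutex_unlock(&nesting_mutex);
	}
}

static void node_unlock(struct node *n)
{
	pthread_mutex_lock(&nesting_mutex);
	if (--n->ref_cnt == 0) {
		n->owned = false;
		pthread_mutex_unlock(&n->mutex);
	}
	pthread_mutex_unlock(&nesting_mutex);
}

/* Lock a node and every supply above it. */
static void lock_chain(struct node *n)
{
	for (; n; n = n->supply)
		node_lock(n);
}

static void unlock_chain(struct node *n)
{
	for (; n; n = n->supply)
		node_unlock(n);
}

/* Two coupled nodes share one supply; return the supply's nesting depth. */
int demo_shared_supply_depth(void)
{
	struct node supply, a, b;
	unsigned int depth;

	node_init(&supply, NULL);
	node_init(&a, &supply);
	node_init(&b, &supply);

	lock_chain(&a);		/* takes a, then supply */
	lock_chain(&b);		/* takes b, re-enters supply */
	depth = supply.ref_cnt;	/* single-threaded demo, safe to read */

	unlock_chain(&b);
	unlock_chain(&a);
	assert(supply.ref_cnt == 0 && !supply.owned);

	return (int)depth;
}
```

Locking both coupled nodes walks into the shared supply a second time, so
the supply's nesting count reaches 2. A pure ww_mutex walk would instead
detect the repeated node within one acquire context, which is the
direction discussed above.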