Re: regulator: deadlock vs memory reclaim

From: Mark Brown
Date: Mon Aug 10 2020 - 13:32:06 EST


On Mon, Aug 10, 2020 at 06:09:36PM +0200, Michał Mirosław wrote:
> On Mon, Aug 10, 2020 at 04:39:28PM +0100, Mark Brown wrote:
> > On Mon, Aug 10, 2020 at 12:25:37AM +0200, Michał Mirosław wrote:

> > > regulator_lock_dependent() starts by taking regulator_list_mutex, The
> > > same mutex covers eg. regulator initialization, including memory allocations
> > > that happen there. This will deadlock when you have filesystem on eg. eMMC
> > > (which uses a regulator to control module voltages) and you register
> > > a new regulator (hotplug a device?) when under memory pressure.

> > OK, that's very much a corner case, it only applies in the case of
> > coupled regulators. The most obvious thing here would be to move the
> > allocations on registration out of the locked region, we really only
> > need this in the regulator_find_coupler() call I think. If the
> > regulator isn't coupled we don't need to take the lock at all.

> Currently, regulator_lock_dependent() is called by eg. regulator_enable() and
> regulator_get_voltage(), so actually any regulator can deadlock this way.

The initialization cases that are the trigger are only done for coupled
regulators though AFAICT, otherwise we're not doing allocations with the
lock held and should be able to progress.

> > > Basically, we have a BKL for regulator_enable() and we're using ww_mutex
> > > as a recursive mutex with no deadlock prevention whatsoever. The locks
> > > also seem to cover way to much (eg. initialization even before making the
> > > regulator visible to the system).

> > Could you be more specific about what you're looking at here? There's
> > nothing too obvious jumping out from the code here other than the bit
> > around the coupling allocation, otherwise it looks like we're locking
> > list walks.

> When you look at the regulator API (regulator_enable() and friends),
> then in their implementation we always start by .._lock_dependent(),
> which takes regulator_list_mutex around its work. This mutex is what
> makes the code deadlock-prone vs memory allocations. I have a feeling
> that this lock is a workaround for historical requirements (recursive
> locking of regulator_dev) that might be no longer needed or is just
> too defensive programming. Hence my other patches and this inquiry.

We need to both walk up the tree for operations that involve the parent
and rope in any peers that are coupled, the idea is basically to avoid
changes to the coupling while we're trying to figure that out.

Attachment: signature.asc
Description: PGP signature