Re: [PATCH RFC 01/10] PM: runtime: Add helpers to resume consumers

From: Miquel Raynal
Date: Fri Mar 28 2025 - 05:59:53 EST


Hello Rafael,

>> The runtime PM core currently allows to runtime resume/suspend a device,
>> or its suppliers.
>>
>> Let's make it also possible to runtime resume/suspend consumers.
>>
>> Consumers and suppliers are seen here through the description made by
>> device_links.
>
> It would be good to explain why all of this is needed.
>
> I gather that it is used for resolving some synchronization issues in
> the clk framework, but neither the cover letter nor this changelog
> explains how it is used.

The explanation is quite long; there have already been 3 full threads
from people attempting to fix a problem that resides in the clock
subsystem (but that is probably present in other subsystems too, just
not uncovered so far). I don't know if you have had the time to read
the cover letter:
https://lore.kernel.org/linux-clk/20250326-cross-lock-dep-v1-0-3199e49e8652@xxxxxxxxxxx/
It tries to explain the problem and the proposed approach to fix it,
but let me try to give a runtime-PM-focused view of it here.

[Problem]

We do have an ABBA locking situation between clk and any other subsystem
that might be in use during runtime_resume() operations, provided that
these subsystems also make clk calls at some point. The usual suspects
here are power domains.

There are different approaches that can be taken, but the one that felt
the most promising when we discussed it during the last LPC (and also
the one that was already partially implemented in the clk subsystem for
a tiny portion of it) is the rule that "subsystem locks should not be
kept acquired while calling into other subsystems".

Typically in the clk subsystem the logic is:

func() {
	mutex_lock(clk);
	runtime_resume(clk);
	...
}

Whereas what would definitely work without locking issues is the
opposite:

func() {
	runtime_resume(clk);
	mutex_lock(clk);
	...
}

Of course life is not so simple: the clock core is highly recursive,
which means inverting the two calls as hinted above simply does not
work as we go deeper in the subcalls. As a result, we need to runtime
resume *all* the relevant clocks in advance, before calling functions
recursively (the lock itself is allowed to re-enter and is not blocking
in this case).
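Concretely, the shape I am aiming for in the clk core is something like
the sketch below. clk_prepare_lock()/clk_prepare_unlock() are the
existing reentrant lock; clk_pm_runtime_resume_all() and
clk_pm_runtime_put_all() are hypothetical helpers standing in for
"resume/put every clock the recursive walk may touch":

```c
static int clk_some_op(struct clk_core *core)
{
	int ret;

	/*
	 * Runtime resume every relevant clock while no subsystem lock
	 * is held (hypothetical helper, not the exact code).
	 */
	ret = clk_pm_runtime_resume_all(core);
	if (ret)
		return ret;

	/* The prepare lock is reentrant, so recursion is fine here. */
	clk_prepare_lock();
	/* ... recursive subcalls, no runtime PM calls from here on ... */
	clk_prepare_unlock();

	clk_pm_runtime_put_all(core);
	return 0;
}
```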

I followed all possible paths in the clock subsystem and identified 4
main categories. The list of clocks we need to runtime resume in advance
can be:
1- the parent clocks
2- the child clocks
3- the parent and child clocks
4- all the clocks (typically for debugfs/sysfs purposes).

[Solution 1: discarded]

The first approach was to do some guessing based on the clock tree
topology. Unfortunately this approach does not stand: in order to know
the clock topology we must acquire the main clock lock, but in order to
runtime resume we must release it. As a result, the logic is virtually
unbounded (even though in practice we would converge at some point). So
this approach was discarded by Steven.

[Solution 2: this proposal]

After the LPC discussion with Steven, I also discussed this with
Saravana and he pointed out that, since we now use fw_devlink=rpm by
default, all providers -including clock controllers of course- would
already be runtime resumed the first time we made a runtime_resume(clk)
call, and thus all the nested calls were no longer needed. This native
solution was already addressing point #1 above (and partially point #3),
and all I had to do was to make a similar function for point #2.

And here we are, trying to resume all consumers (from a device link
perspective), which includes, but is not limited to, consumer clocks.
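For reference, such a helper could look roughly like the existing
pm_runtime_get_suppliers(), but walking the supplier's consumers list
instead. This is only a sketch of the idea, not the exact code from the
patch:

```c
void pm_runtime_resume_consumers(struct device *dev)
{
	struct device_link *link;
	int idx;

	idx = device_links_read_lock();

	/* Walk the links where @dev is the supplier. */
	list_for_each_entry_rcu(link, &dev->links.consumers, s_node,
				device_links_read_lock_held())
		if (link->flags & DL_FLAG_PM_RUNTIME)
			pm_runtime_resume(link->consumer);

	device_links_read_unlock(idx);
}
```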

I hope this explanation helps in understanding this patch and why it is
needed for this series. As stated in the cover letter, I have tried to
keep the changes here to a minimum. Maybe there are other/better ways
to do this and we can discuss them. My priority is however to get this
possible ABBA deadlock situation sorted out.

I can further expand the commit log with these details if you want.

Thanks,
Miquèl