Re: [PATCH RFC 00/10] Fix the ABBA locking situation between clk and runtime PM
From: Stephen Boyd
Date: Mon Apr 14 2025 - 21:00:29 EST
Quoting Miquel Raynal (2025-03-26 11:26:15)
> As explained in the following thread, there is a known ABBA locking
> dependency between clk and runtime PM.
> Link: https://lore.kernel.org/linux-clk/20240527181928.4fc6b5f0@xps-13/
>
> The problem is that the clk subsystem uses a mutex to protect concurrent
> accesses to its tree structure, and so do other subsystems such as
> generic power domains. While it holds its own mutex, the clk subsystem
> performs runtime PM calls which end up executing callbacks from other
> subsystems (again, gen PD is in the loop). But typically power domains
> may also need to perform clock related operations, and thus the
> following two situations may happen:
>
> mutex_lock(clk);
> mutex_lock(genpd);
>
> or
>
> mutex_lock(genpd);
> mutex_lock(clk);
>
> As of today I know that at least NXP i.MX8MP and MediaTek MT8183 SoCs
> are complex enough to face this kind of issues.
>
> There's been a first workaround to "silence" lockdep with the most
> obvious case triggering the warning: making sure all clocks are RPM
> enabled before running the clk_disable_unused() work, but this is just
> addressing one situation among many other potentially problematic
> situations. In the past, both Laurent Pinchart and Marek Vasut have
> experienced these issues when enabling HDMI and audio support,
> respectively.
>
> Following a discussion we had at last Plumbers with Steven, I am
> proposing to decouple both locks by changing a bit the clk approach:
> let's always runtime resume all clocks that we *might* need before
> taking the clock lock. But how do we know the list? Well, depending on
> the situation we may either need to wake up:
> - the upper part of the tree during prepare/unprepare operations.
> - the lower part of the tree during (read) rate operations.
> - the upper part and the lower part of the tree otherwise (especially
> during rate changes which may involve reparenting).
Thanks for taking on this work. This problem is coming up more and more
often.
>
> Luckily, we do not need to do that by hand, are more importantly we do
> not need to use the clock tree for that because thanks to the work from
> Saravana, we already have device links describing exhaustively the
> consumer/supplier relationships. The clock topology (from a runtime PM
> perspective) is reflected in these links. In practice, we do not care
> about all consumers, but the few clock operations that will actually
> trigger runtime PM operations are probably not impacting enough to
> justify something more complex.
This won't always work, for a couple reasons. First because clk drivers
aren't required to describe their parent clks that are outside the clk
controller by using DT with a 'clocks' property in the clk controller
node. Second because there can be a many to one relationship between a
struct device and struct device_node. We're trying to push drivers to be
written in a way that the binding has the 'clocks' property, but that
isn't always the case, so we still need a solution that works in all
cases so as to not regress old (legacy?) implementations or ones that
divide a platform device into some number of auxiliary devices and
drivers.
One idea to do that would be to implement the device links between clk
controller devices based on all possible parents of the clk. We support
lazily registering clks though, meaning a parent could be registered at
any time, so we would have to explore the clk tree each time a clk is
registered to see if any new clks need to be found and device links made
between devices. The general algorithm is probably something like:
clk_register()
make_links_for_node()
if device node missing 'clocks' property
for each parent string
pclk = find parent clk
pdev = pclk->dev
link pdev to dev node
else
for each clk in clocks property
pclk = find parent clk
pdev = pclk->dev
link pdev to dev node
We have to get the parent clk in all cases because we don't know which
device it may be registered for (the platform device or auxiliary
device). If we optimize here, I'd prefer we optimize for the case where
the 'clocks' property is present to encourage migration. Maybe what we
can do is make some structure for a clk controller and have a linked
list of those that we look up when a new clk is registered. We actually
have that already with 'struct clock_provider' so maybe we need to
extend that.
Stash the device pointer in there and some variable sized array of the
clk_core pointers to the external clks. In the 'clocks' DT property
case, we can size this immediately and map the array index to the
property but in the non-property case we'll have to grow this array each
time a new clk is found to be a parent of the device. Maybe for that we
should just have some other sort of linked list of clk_core pointers
that we continue to stack clks onto.
struct clock_provider {
void (*clk_init_cb)(struct device_node *);
struct device *dev;
struct device_node *np;
struct list_head node;
struct clk_core *legacy_clks; // Or struct list_head legacy?
size_t len_clocks;
struct clk_core clocks_property[];
};