Re: [PATCH] reset: use a shared SRCU domain for reset controls

From: Steven Price

Date: Thu Apr 23 2026 - 08:50:46 EST


+Heiko for the Rockchip questions.

On 23/04/2026 11:27, Philipp Zabel wrote:
> Hi Steven,
>
> On Fri, 2026-04-17 at 16:48 +0100, Steven Price wrote:
>> Commit 78ebbff6d1a0 ("reset: handle removing supplier before consumers")
>> added a dynamically initialized srcu_struct to every reset_control and
>> cleaned it up again when the handle was dropped.
>>
>> That breaks early boot users which acquire and release reset handles
>> before workqueues are online. On rk3288 this shows up during
>> rockchip_smp_prepare_cpus(), where pmu_set_power_domain() gets a reset
>> control for a CPU core and then drops it again before SMP bring-up has
>> finished.
>
> Can the reset_control_put() call be dropped from pmu_set_power_domain()
> to fix the problem?

I'm not that familiar with the code, so I'm not sure.

Just dropping that call causes a WARN_ON() when bringing the secondary
CPUs online (because the call to rockchip_get_core_reset() expects to
have exclusive access to the reset). Switching to a shared reset then
hits a WARN_ON() in reset_control_assert() because deassert_count == 0.
I could keep digging blindly but I'm not really sure how this code is
meant to work.
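
To illustrate the second warning, here is a toy userspace model of the
shared-reset bookkeeping as I understand it. This is only a sketch, not
the kernel code: the toy_* names and the struct are made up for
illustration, and only the deassert_count check mirrors what
reset_control_assert() complains about.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical toy model, NOT the kernel's struct reset_control. */
struct toy_reset {
	int deassert_count;	/* users currently holding the line deasserted */
	int line_asserted;	/* 1 = line asserted, 0 = line deasserted */
};

/* Shared deassert: the first user releases the line, later users only count. */
static int toy_shared_deassert(struct toy_reset *r)
{
	if (r->deassert_count++ == 0)
		r->line_asserted = 0;
	return 0;
}

/* Shared assert: only valid as the counterpart of an earlier deassert. */
static int toy_shared_assert(struct toy_reset *r)
{
	if (r->deassert_count == 0) {
		/* Stands in for the WARN_ON() seen on the board. */
		fprintf(stderr, "WARN: assert with deassert_count == 0\n");
		return -EINVAL;
	}
	/* The line is only asserted again once the last user lets go. */
	if (--r->deassert_count == 0)
		r->line_asserted = 1;
	return 0;
}
```

In this model, calling toy_shared_assert() on a freshly acquired handle
fails straight away, which would match a caller that asserts the core
reset before it has ever deasserted it through that handle.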

Hopefully Heiko might be able to shed some more light on this?

> Putting the reset control should mean that the driver doesn't care
> about the state of the reset line anymore, but the platsmp code very
> much expects the reset line to stay deasserted after enabling a CPU.
> Acquiring reset controls in rockchip_smp_prepare_cpus() once and never
> giving them up via reset_control_put() seems like a correct fix,
> regardless of whether this patch is applied or not.
>
> It looks like the meson platsmp suffers from the same issue.

This is why I did the fix in the reset code - how many other platforms
might have similar issues? But obviously if these platforms are buggy
then they should be fixed.

My interest is keeping the devboard working so I can keep testing
Panfrost on it.

Thanks,
Steve

>> cleanup_srcu_struct() then tries to flush delayed SRCU work
>> and hits the WARN_ON(!wq_online) path, which can leave the machine
>> hanging before the serial console appears.
>>
>> Keep the supplier-removal protection, but move it to a single shared
>> static SRCU domain for the reset core. That preserves the rcdev lifetime
>> protection needed for supplier unregister without requiring per-handle
>> init_srcu_struct()/cleanup_srcu_struct() on normal get/put paths.
>
> I'd prefer to document the workqueue requirement and keep the SRCU
> domain per reset_control, if possible.
>
> regards
> Philipp