Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support

From: Michal Wilczynski
Date: Wed Mar 19 2025 - 05:22:36 EST




On 3/13/25 10:25, Philipp Zabel wrote:
> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
>>
>> On 3/6/25 00:47, Stephen Boyd wrote:
>>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>>>> configurable. In order for these clock gates to work properly, the
>>>> CLKGEN reset must be managed in a specific sequence.
>>>>
>>>> Move the CLKGEN reset handling to the clock driver since it's
>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
>>>> GPU clocks stay physically enabled without external interference from
>>>> the reset driver. The reset is now deasserted only when both core and
>>>> cfg clocks are enabled, and asserted when either of them is disabled.
>>>>
>>>> The mem clock is configured to use nop operations since it cannot be
>>>> controlled.
>>>>
>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@xxxxxxxxxxxxxx [1]
>>>>
>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@xxxxxxxxxxx>
>>> [...]
>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>>>> index ea96d007aecd..1dfcde867233 100644
>>>> --- a/drivers/clk/thead/clk-th1520-ap.c
>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
>>> [...]
>>>>
>>>> static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>>> video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>>>> + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>>>> + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>>>> +
>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>>>> +{
>>>> + struct ccu_gate *cg = hw_to_ccu_gate(hw);
>>>> + unsigned long flags;
>>>> +
>>>> + spin_lock_irqsave(&gpu_reset_lock, flags);
>>>> +
>>>> + ccu_disable_helper(&cg->common, cg->enable);
>>>> +
>>>> + if ((cg == &gpu_core_clk &&
>>>> + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>>>> + (cg == &gpu_cfg_aclk &&
>>>> + !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>>>> + reset_control_assert(gpu_reset);
>>>
>>> Why can't the clk consumer control the reset itself? Doing this here is
>>> not ideal because we hold the clk lock when we try to grab the reset
>>> lock. These are all spinlocks that should be small in lines of code
>>> where the lock is held, but we're calling into an entire other framework
>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
>>> lock it will deadlock.
>>
>> So in our case the clk consumer is the drm/imagination driver. Here is
>> the comment from the maintainer for my previous attempt to use a reset
>> driver to abstract the GPU init sequence [1]:
>>
>> "Do you know what this resets? From our side, the GPU only has a single
>> reset line (which I assume to be GPU_RESET)."
>>
>> "I don't love that this procedure appears in the platform reset driver.
>> I appreciate it may not be clear from the SoC TRM, but this is the
>> standard reset procedure for all IMG Rogue GPUs. The currently
>> supported TI SoC handles this in silicon, when power up/down requests
>> are sent so we never needed to encode it in the driver before.
>>
>> Strictly speaking, the 32 cycle delay is required between power and
>> clocks being enabled and the reset line being deasserted. If nothing
>> here touches power or clocks (which I don't think it should), the delay
>> could potentially be lifted to the GPU driver."
>>
>> From the drm/imagination maintainers point of view their hardware has
>> only one reset, the extra CLKGEN reset is SoC specific.
>
> If I am understanding correctly, the CLKGEN reset doesn't reset
> anything in the GPU itself, but holds the GPU clock generator block in
> reset, effectively disabling the three GPU clocks as a workaround for
> the always-ungated GPU_MEM clock.
>
>> Also the reset driver maintainer didn't like my way of abstracting two
>> resets ("GPU" and and SoC specific"CLKGEN") into one reset
>
> That is one part of it. The other is that (according to my
> understanding as laid out above), the combined GPU+CLKGEN reset would
> effectively disable all three GPU clocks for a while, after the GPU
> driver has already requested them to be enabled.

Thank you for your comments Philipp, it seems like we're on the same
page here. I was wondering whether there is anything I can do to move the
patches forward.

Stephen, if the current patch is a no go from your perspective could you
please advise whether there is a way to solve this in a clock that would
be acceptable to you.

Thanks,
Michał

>
>> to make it
>> seem to the consumer driver drm/imagination like there is only one
>> reset and suggested and attempt to code the re-setting in the clock
>> driver [2]. Even though he suggested a different way of achieving that:
>>
>> "In my mind it shouldn't be much: the GPU clocks could all share the
>> same refcounted implementation. The first clock to get enabled would
>> ungate both GPU_CORE and GPU_CFG_ACLK gates and deassert
>> GPU_SW_CLKGEN_RST, all in one place. The remaining enable(s) would be
>> no-ops. Would that work?"
>>
>> The above would have similar effect, but I felt like enabling both
>> clocks in single .enable callback could be confusing so I ended up with
>> the current approach. This could be easily re-done if you feel this
>> would be better.
>>
>> I agree that using spinlocks here is dangerous, but looking at the code
>> of the reset_control_deassert and reset_control_assert, it doesn't seem
>> like any spinlocks are acquired/relased in that code flow, unless the
>> driver ops would introduce that. So in this specific case deadlock
>> shouldn't happen ?
>
> There are no spinlocks in the reset_control_(de)assert paths in the
> reset framework, but in general there could be in the driver. The thead
> driver [1], uses regmap_update_bits() on a regmap with .fast_io = true,
> which uses the internal struct regmap::spinlock.
>
> [1] https://lore.kernel.org/all/20250303152511.494405-3-m.wilczynski@xxxxxxxxxxx/
>
>
> regards
> Philipp
>