Re: [PATCH 2/4] clk: renesas: cpg-mssr: Add table-driven MSTP dummy-read delay for LCDC on RZ/T2H

From: Geert Uytterhoeven

Date: Fri Jun 05 2026 - 09:39:08 EST


Hi Prabhakar,

On Mon, 11 May 2026 at 21:19, Prabhakar <prabhakar.csengg@xxxxxxxxx> wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
>
> Per the RZ/T2H hardware manual, to secure processing after release from
> the module-stop state, dummy read the same MSTPCRm register at least
> seven times for most IP blocks, at least 100 times for LCDC and at
> least 300 times for RTC before proceeding with subsequent processing.
>
> The existing udelay(10) satisfies the seven dummy-read requirement for
> most IP blocks. Extend this to support per-IP dummy-read requirements
> by introducing a table-driven lookup, rzt2h_mstp_delay_table, where
> each entry records the MSTPCRm register index, bit position and the
> minimum dummy-read count from the hardware manual, converted to
> microseconds via RZT2H_MSTP_READS_TO_US().
>
> Introduce cpg_rzt2h_mstp_get_delay_us() to replace the open-coded
> udelay(10) calls. In cpg_mstp_clock_endisable() the exact register and
> bit are known so the lookup matches on both fields. In
> cpg_mssr_resume_noirq() the register is known but not the individual
> bit, so pass RZT2H_MSTP_ANY_BIT causing the lookup to match on the
> register alone and return the delay for the first matching entry.
>
> Add an entry for LCDC which requires at least 100 dummy reads. Adding
> support for further IP blocks with non-default requirements only needs
> a new entry in rzt2h_mstp_delay_table with no logic changes needed.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>

Thanks for your patch!

> --- a/drivers/clk/renesas/renesas-cpg-mssr.c
> +++ b/drivers/clk/renesas/renesas-cpg-mssr.c
> @@ -96,6 +96,24 @@ static const u16 mstpcr_for_gen4[] = {
> #define RZT2H_MSTPCR_BLOCK(x) ((x) >> RZT2H_MSTPCR_BLOCK_SHIFT)
> #define RZT2H_MSTPCR_OFFSET(x) ((x) & RZT2H_MSTPCR_OFFSET_MASK)
>
> +/* Dummy read counts as specified by the RZ/T2H hardware manual */
> +#define RZT2H_MSTP_DEFAULT_DUMMY_READS 7
> +#define RZT2H_MSTP_LCDC_DUMMY_READS 100
> +
> +/*
> + * Time per dummy read in nanoseconds, derived from the original udelay(10)
> + * which was used to satisfy the 7 dummy-read requirement:
> + * 10000 ns / 7 reads = 1429 ns per read.
> + */
> +#define RZT2H_MSTP_DUMMY_READ_NS 1429
> +#define RZT2H_MSTP_READS_TO_US(n) (((n) * RZT2H_MSTP_DUMMY_READ_NS) / 1000)

IMHO this is overly complicated and hard to follow...

> +#define RZT2H_MSTP_DEFAULT_DELAY_US RZT2H_MSTP_READS_TO_US(RZT2H_MSTP_DEFAULT_DUMMY_READS)

i.e. this is just 10 again?

> +
> +#define RZT2H_MSTPCRM_INDEX 12
> +#define RZT2H_MSTPCRM04_LCDC 4

Everywhere else (DEF_MOD(), DTS) we refer to module clocks using the
sparse base-10 combined number, i.e. "1204", so I think it would be
better to do the same here.

> +
> +#define RZT2H_MSTP_ANY_BIT U32_MAX
> +
> static const u16 mstpcr_for_rzt2h[] = {
> RZT2H_MSTPCR(0, 0x300), /* MSTPCRA */
> RZT2H_MSTPCR(0, 0x304), /* MSTPCRB */
> @@ -113,6 +131,35 @@ static const u16 mstpcr_for_rzt2h[] = {
> RZT2H_MSTPCR(1, 0x334), /* MSTPCRN */
> };
>
> +/**
> + * struct rzt2h_mstp_delay_entry - MSTP dummy-read requirement for RZ/T2H
> + *
> + * @reg: Index into control_regs[]. Exact match.
> + * @bit: MSTP bit position, or RZT2H_MSTP_ANY_BIT for register-level match.
> + * @delay_us: Computed delay in microseconds to satisfy the dummy read requirement.
> + */
> +struct rzt2h_mstp_delay_entry {
> + u32 reg;
> + u32 bit;

The sparse base-10 combined number or packed index would need just a
single u32...

> + u32 delay_us;
> +};
> +
> +/*
> + * Per RZ/T2H HW manual: to secure processing after release from the
> + * module-stop state, dummy read the same register at least seven times
> + * (except RTC and LCDC) after writing to initiate release from the
> + * module-stop state. For RTC, dummy read at least 300 times and for
> + * LCDC, at least 100 times.
> + *
> + * Instead of performing the actual dummy reads, an equivalent delay is
> + * added using udelay(), computed from the required read count via
> + * RZT2H_MSTP_READS_TO_US().
> + */
> +static const struct rzt2h_mstp_delay_entry rzt2h_mstp_delay_table[] = {
> + { RZT2H_MSTPCRM_INDEX, RZT2H_MSTPCRM04_LCDC,
> + RZT2H_MSTP_READS_TO_US(RZT2H_MSTP_LCDC_DUMMY_READS) },

"210" (us)?

> +};
> +
> /*
> * Standby Control Register offsets (RZ/A)
> * Base address is FRQCR register
> @@ -253,6 +300,20 @@ static void cpg_rzt2h_mstp_write(struct cpg_mssr_priv *priv, u16 offset, u32 val
> writel(value, base + RZT2H_MSTPCR_OFFSET(offset));
> }
>
> +static unsigned int cpg_rzt2h_mstp_get_delay_us(u32 reg, u32 bit)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < ARRAY_SIZE(rzt2h_mstp_delay_table); i++) {
> + const struct rzt2h_mstp_delay_entry *e = &rzt2h_mstp_delay_table[i];
> +
> + if (e->reg == reg && (e->bit == bit || bit == RZT2H_MSTP_ANY_BIT))
> + return e->delay_us;
> + }

Given there are only two modules (LCDC and RTC) that need special
handling, a table sounds like overkill to me.
For exact matching, a switch() statement with two entries and a default
would do.
For wildcard bit matching, perhaps you can use a mask?

unsigned int mask = bit_valid ? GENMASK(31, 0) : GENMASK(31, 5));

if (idx == (MOD_CLK_PACK(1204) & mask)) {
/* LCDC needs 300 dummy reads, or 210 us */
return 210;
} else if (idx == (MOD_CLK_PACK(605) & mask)) {
/* RTC needs 100 dummy reads, or 70 us */
return 70;
} else {
/* default 7 dummy reads, or 10 us */
return 10;
}

What do you think?

> +
> + return RZT2H_MSTP_DEFAULT_DELAY_US;

You might as well just do the udelay() here, too.

> +}
> +
> static int cpg_mstp_clock_endisable(struct clk_hw *hw, bool enable)
> {
> struct mstp_clock *clock = to_mstp_clock(hw);
> @@ -312,7 +373,7 @@ static int cpg_mstp_clock_endisable(struct clk_hw *hw, bool enable)
> * register, we simply add a delay after the read operation.
> */
> cpg_rzt2h_mstp_read(priv, priv->control_regs[reg]);
> - udelay(10);
> + udelay(cpg_rzt2h_mstp_get_delay_us(reg, bit));

In this function, you do have the packed clock index (clock->index).

> return 0;
> }
>
> @@ -1142,7 +1203,7 @@ static int cpg_mssr_resume_noirq(struct device *dev)
> cpg_rzt2h_mstp_write(priv, priv->control_regs[reg], newval);
> /* See cpg_mstp_clock_endisable() on why this is necessary. */
> cpg_rzt2h_mstp_read(priv, priv->control_regs[reg]);
> - udelay(10);
> + udelay(cpg_rzt2h_mstp_get_delay_us(reg, RZT2H_MSTP_ANY_BIT));

Here you don't have it, but idx = reg * 32, and bit_valid = false;

> continue;
> } else
> writel(newval, priv->pub.base0 + priv->control_regs[reg]);

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds