Re: [PATCH 2/4] clk: renesas: cpg-mssr: Add table-driven MSTP dummy-read delay for LCDC on RZ/T2H
From: Lad, Prabhakar
Date: Mon Jun 08 2026 - 15:31:56 EST
Hi Geert,
Thank you for the review.
On Fri, Jun 5, 2026 at 2:38 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Prabhakar,
>
> On Mon, 11 May 2026 at 21:19, Prabhakar <prabhakar.csengg@xxxxxxxxx> wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
> >
> > Per the RZ/T2H hardware manual, to secure processing after release from
> > the module-stop state, dummy read the same MSTPCRm register at least
> > seven times for most IP blocks, at least 100 times for LCDC and at
> > least 300 times for RTC before proceeding with subsequent processing.
> >
> > The existing udelay(10) satisfies the seven dummy-read requirement for
> > most IP blocks. Extend this to support per-IP dummy-read requirements
> > by introducing a table-driven lookup, rzt2h_mstp_delay_table, where
> > each entry records the MSTPCRm register index, bit position and the
> > minimum dummy-read count from the hardware manual, converted to
> > microseconds via RZT2H_MSTP_READS_TO_US().
> >
> > Introduce cpg_rzt2h_mstp_get_delay_us() to replace the open-coded
> > udelay(10) calls. In cpg_mstp_clock_endisable() the exact register and
> > bit are known so the lookup matches on both fields. In
> > cpg_mssr_resume_noirq() the register is known but not the individual
> > bit, so pass RZT2H_MSTP_ANY_BIT causing the lookup to match on the
> > register alone and return the delay for the first matching entry.
> >
> > Add an entry for LCDC which requires at least 100 dummy reads. Adding
> > support for further IP blocks with non-default requirements only needs
> > a new entry in rzt2h_mstp_delay_table with no logic changes needed.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
>
> Thanks for your patch!
>
> > --- a/drivers/clk/renesas/renesas-cpg-mssr.c
> > +++ b/drivers/clk/renesas/renesas-cpg-mssr.c
> > @@ -96,6 +96,24 @@ static const u16 mstpcr_for_gen4[] = {
> > #define RZT2H_MSTPCR_BLOCK(x) ((x) >> RZT2H_MSTPCR_BLOCK_SHIFT)
> > #define RZT2H_MSTPCR_OFFSET(x) ((x) & RZT2H_MSTPCR_OFFSET_MASK)
> >
> > +/* Dummy read counts as specified by the RZ/T2H hardware manual */
> > +#define RZT2H_MSTP_DEFAULT_DUMMY_READS 7
> > +#define RZT2H_MSTP_LCDC_DUMMY_READS 100
> > +
> > +/*
> > + * Time per dummy read in nanoseconds, derived from the original udelay(10)
> > + * which was used to satisfy the 7 dummy-read requirement:
> > + * 10000 ns / 7 reads = 1429 ns per read.
> > + */
> > +#define RZT2H_MSTP_DUMMY_READ_NS 1429
> > +#define RZT2H_MSTP_READS_TO_US(n) (((n) * RZT2H_MSTP_DUMMY_READ_NS) / 1000)
>
> IMHO this is overly complicated and hard to follow...
>
> > +#define RZT2H_MSTP_DEFAULT_DELAY_US RZT2H_MSTP_READS_TO_US(RZT2H_MSTP_DEFAULT_DUMMY_READS)
>
> i.e. this is just 10 again?
>
Yep.
> > +
> > +#define RZT2H_MSTPCRM_INDEX 12
> > +#define RZT2H_MSTPCRM04_LCDC 4
>
> Everywhere else (DEF_MOD(), DTS) we refer to module clocks using the
> sparse base-10 combined number, i.e. "1204", so I think it would be
> better to do the same here.
>
> > +
> > +#define RZT2H_MSTP_ANY_BIT U32_MAX
> > +
> > static const u16 mstpcr_for_rzt2h[] = {
> > RZT2H_MSTPCR(0, 0x300), /* MSTPCRA */
> > RZT2H_MSTPCR(0, 0x304), /* MSTPCRB */
> > @@ -113,6 +131,35 @@ static const u16 mstpcr_for_rzt2h[] = {
> > RZT2H_MSTPCR(1, 0x334), /* MSTPCRN */
> > };
> >
> > +/**
> > + * struct rzt2h_mstp_delay_entry - MSTP dummy-read requirement for RZ/T2H
> > + *
> > + * @reg: Index into control_regs[]. Exact match.
> > + * @bit: MSTP bit position, or RZT2H_MSTP_ANY_BIT for register-level match.
> > + * @delay_us: Computed delay in microseconds to satisfy the dummy read requirement.
> > + */
> > +struct rzt2h_mstp_delay_entry {
> > + u32 reg;
> > + u32 bit;
>
> The sparse base-10 combined number or packed index would need just a
> single u32...
>
> > + u32 delay_us;
> > +};
> > +
> > +/*
> > + * Per RZ/T2H HW manual: to secure processing after release from the
> > + * module-stop state, dummy read the same register at least seven times
> > + * (except RTC and LCDC) after writing to initiate release from the
> > + * module-stop state. For RTC, dummy read at least 300 times and for
> > + * LCDC, at least 100 times.
> > + *
> > + * Instead of performing the actual dummy reads, an equivalent delay is
> > + * added using udelay(), computed from the required read count via
> > + * RZT2H_MSTP_READS_TO_US().
> > + */
> > +static const struct rzt2h_mstp_delay_entry rzt2h_mstp_delay_table[] = {
> > + { RZT2H_MSTPCRM_INDEX, RZT2H_MSTPCRM04_LCDC,
> > + RZT2H_MSTP_READS_TO_US(RZT2H_MSTP_LCDC_DUMMY_READS) },
>
> "210" (us)?
>
142us
> > +};
> > +
> > /*
> > * Standby Control Register offsets (RZ/A)
> > * Base address is FRQCR register
> > @@ -253,6 +300,20 @@ static void cpg_rzt2h_mstp_write(struct cpg_mssr_priv *priv, u16 offset, u32 val
> > writel(value, base + RZT2H_MSTPCR_OFFSET(offset));
> > }
> >
> > +static unsigned int cpg_rzt2h_mstp_get_delay_us(u32 reg, u32 bit)
> > +{
> > + unsigned int i;
> > +
> > + for (i = 0; i < ARRAY_SIZE(rzt2h_mstp_delay_table); i++) {
> > + const struct rzt2h_mstp_delay_entry *e = &rzt2h_mstp_delay_table[i];
> > +
> > + if (e->reg == reg && (e->bit == bit || bit == RZT2H_MSTP_ANY_BIT))
> > + return e->delay_us;
> > + }
>
> Given there are only two modules (LCDC and RTC) that need special
> handling, a table sounds like overkill to me.
> For exact matching, a switch() statement with two entries and a default
> would do.
> For wildcard bit matching, perhaps you can use a mask?
>
> unsigned int mask = bit_valid ? GENMASK(31, 0) : GENMASK(31, 5));
>
> if (idx == (MOD_CLK_PACK(1204) & mask)) {
> /* LCDC needs 300 dummy reads, or 210 us */
> return 210;
> } else if (idx == (MOD_CLK_PACK(605) & mask)) {
> /* RTC needs 100 dummy reads, or 70 us */
> return 70;
> } else {
> /* default 7 dummy reads, or 10 us */
> return 10;
> }
>
> What do you think?
>
agreed, this approach would much simpler.
> > +
> > + return RZT2H_MSTP_DEFAULT_DELAY_US;
>
> You might as well just do the udelay() here, too.
>
Ok I will have the below,
static void cpg_rzt2h_mstp_delay(u32 idx, bool bit_valid)
{
unsigned int mask = bit_valid ? GENMASK(31, 0) : GENMASK(31, 5);
if (idx == (MOD_CLK_PACK(1204) & mask)) {
/* LCDC needs 100 dummy reads, or 142us */
udelay(142);
} else if (idx == (MOD_CLK_PACK(605) & mask)) {
/* RTC needs 300 dummy reads, or 428us */
udelay(428);
} else {
/* default 7 dummy reads, or 10us */
udelay(10);
}
}
> > +}
> > +
> > static int cpg_mstp_clock_endisable(struct clk_hw *hw, bool enable)
> > {
> > struct mstp_clock *clock = to_mstp_clock(hw);
> > @@ -312,7 +373,7 @@ static int cpg_mstp_clock_endisable(struct clk_hw *hw, bool enable)
> > * register, we simply add a delay after the read operation.
> > */
> > cpg_rzt2h_mstp_read(priv, priv->control_regs[reg]);
> > - udelay(10);
> > + udelay(cpg_rzt2h_mstp_get_delay_us(reg, bit));
>
> In this function, you do have the packed clock index (clock->index).
>
Agreed, this can be replaced with cpg_rzt2h_mstp_delay(clock->index, true);
> > return 0;
> > }
> >
> > @@ -1142,7 +1203,7 @@ static int cpg_mssr_resume_noirq(struct device *dev)
> > cpg_rzt2h_mstp_write(priv, priv->control_regs[reg], newval);
> > /* See cpg_mstp_clock_endisable() on why this is necessary. */
> > cpg_rzt2h_mstp_read(priv, priv->control_regs[reg]);
> > - udelay(10);
> > + udelay(cpg_rzt2h_mstp_get_delay_us(reg, RZT2H_MSTP_ANY_BIT));
>
> Here you don't have it, but idx = reg * 32, and bit_valid = false;
>
and this one with cpg_rzt2h_mstp_delay(reg * 32, false);
Cheers,
Prabhakar