Re: Marvell NFC timings on CN9130

From: Miquel Raynal
Date: Mon May 22 2023 - 05:04:16 EST


Hi Chris,

Chris.Packham@xxxxxxxxxxxxxxxxxxx wrote on Mon, 22 May 2023 04:53:54
+0000:

> On 22/05/23 10:53, Chris Packham wrote:
> >
> > On 17/05/23 14:22, Chris Packham wrote:
> >>
> >> On 17/05/23 05:25, Miquel Raynal wrote:
> >>> Hi Chris!
> >>>
> >>> Chris.Packham@xxxxxxxxxxxxxxxxxxx wrote on Tue, 16 May 2023 04:46:38
> >>> +0000:
> >>>
> >>>> Hi Miquel, Thomas,
> >>>>
> >>>> A hardware colleague reported a concern to me about a new design we
> >>>> have
> >>>> using the Marvell CN9130 SoC (which I think was called Armada-8K
> >>>> before
> >>>> they rebranded).
> >>>>
> >>>> Basically their concern is that the tWC timing they observe is faster
> >>>> (~18ns) than the documented minimum in the hardware datasheet for the
> >>>> CN9130 (25ns). Aside from not meeting the datasheet spec we've not
> >>>> observed any other issue (yet).
> >>> I would have expected the controller to support almost any kind of
> >>> timings, including SDR EDO mode 5. tWC is 25ns with mode 4, but 20 on
> >>> mode 5 (ONFI). So I believe you're running a system with a chip that is
> >>> not compatible with the fastest mode. If that is the case, it may
> >>> explain why you don't see errors with this chip: it may support
> >>> slightly faster timings than it advertises.
> >>>
> >>> Anyway, if your findings are true, it means the current implementation
> >>> is slightly out of spec and the timing calculation might require to be
> >>> tweaked a little bit to reduce tWC.
> >>>
> >>>> I notice in the marvell_nand.c driver that marvell_nfc_init() sets the
> >>>> NAND Clock Frequency Select bit (0xF2440700:0) to 1 which runs
> >>>> according
> >>>> to the datasheet the NAND flash at 400MHz . But the calculations in
> >>>> marvell_nfc_setup_interface() use the value from
> >>>> clk_get_rate(nfc->core_clk) which is still 250MHz so I'm wondering if
> >>>> maybe the fact that the NAND flash is being run faster is having an
> >>>> impact on timings that are calculated around the core_clk frequency.
> >>> What if you reset this bit? Do you observe different timings? I hope
> >>> you do, otherwise this is a dead-end.
> >> Yes if we clear the bit the timings go from ~18ns to about 30ns.
> >>> The timings are derived from this clock but I remember seeing different
> >>> rates than the ones I expected with no obvious explanation (see the "*
> >>> 2" in the calculation of period_ns and the comment right below). So
> >>> maybe this is due to the 400MHz vs. 250MHz issue you are reporting, or
> >>> there is an undocumented pre-scaler in-between (this is my original
> >>> guess).
> >>
> >> I wondered if the * 2 was because of this or because of the comment
> >> that the ECC_CLK is 2*NF_CLK. That probably also means that a number
> >> of SoCs are running with an extra *2 that don't need to be (e.g.
> >> Armada-385).
> > Interestingly cp110-system-controller.c is aware of the 400MHz option
> > but that's only effective if it's been set prior to the kernel
> > booting. I'm not really familiar with clk drivers but I assume it must
> > be possible to make it look up the frequency dynamically instead of
> > using a single fixed value.
> >>
> >>>> Do you think that the timings calculations should take the NAND Clock
> >>>> Frequency Select setting into account?
> >>> There is not much about this clock in the manual, so if the clock is
> >>> feeding the logic of the controller generating the signals on the bus,
> >>> then yes. You can verify this with the test mentioned above.
> >>>
> >>> Could you check the values set to tWP and tWH with and without the bit
> >>> and probe the signals in both cases? Maybe the "* 2" in the
> >>> period_ns calculation will vanish if we use 400MHz as input clock
> >>> rather
> >>> than clk_get_rate() (or better, expose the bit as a mux-clock and use
> >>> it to tell the CCF the right frequency) and you'll get a sharper tWC in
> >>> the end, which hopefully should match the spec this time.
> >>
> >> I was going to have a look to see if I can get the NAND clock to
> >> correctly reflect the value when the NAND Clock Frequency Select bit
> >> is set. In the meantime I'll also do some experiments removing the *
> >> 2 and hard-coding the frequency at 400MHz.
>
> I learnt something over the course of the day. Given timezones I thought
> it might be worthwhile getting them out there even if I don't have a
> patch to offer.

Of course :)

> It appears that only the first set of timings calculated by
> marvell_nfc_setup_interface() are used. This is because
> marvell_nfc_select_target() returns early if we are addressing the same
> chip. So even when we take the SDR timings into account we don't make
> full use of them in NDTR0/1.

The logic in the core has changed in the past, it is possible
that we did not catch a corner case in this driver.

Maybe this would cleanly solve the problem (there is similar operation
somewhere else in the driver):

--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -2457,6 +2457,9 @@ static int marvell_nfc_setup_interface(struct nand_chip *chip, int chipnr,
NDTR1_WAIT_MODE;
}

+ /* Ensure the next *_select_target() call will write the timing registers */
+ nfc->selected_chip = NULL;
+
return 0;
}

> The original problem I reported was from a customized kernel which
> included a change to write out the NDTR0/1 registers at the end of
> marvell_nfc_setup_interface(). So I can make my problem go away by
> removing the writes to NDTR0/1 but then instead of being too fast they
> are now way too slow. That'd probably keep the HW engineers happy but it
> feels a bit wrong.

Yeah, not the right approach.

Thanks,
Miquèl