Re: [RFC PATCH 01/10] spi: spi-mem: Introduce support for tuning controller
From: Santhosh Kumar K
Date: Wed Dec 03 2025 - 03:02:55 EST
Hello Miquel and Pratyush,
On 18/11/25 19:12, Pratyush Yadav wrote:
On Wed, Nov 05 2025, Miquel Raynal wrote:
Hello Santhosh,
- On tuning failure, retry by re-running spi_mem_needs_tuning() with
the second best set of ops (max throughput - 1)
I would like to challenge this need. Can the same calibration fail if
attempted multiple times (eg. because of the heat?) If yes, then we need
a fallback indeed. Otherwise, I'd be in favor of just failing the
probe. Calibration is an opt-in -> users must allow a higher frequency
than they used to in order to enable the feature?
It's possible the same calibration will fail intermittently for
different reasons (temperature changes, as you mentioned). If tuning
fails, the driver should fall back to the non-PHY frequency so the flash
continues operating with slower reads/writes rather than failing the
probe (availability should be prioritized, right?).
Agreed, if the tuning may fail we must fall back in this case. However
there is another situation that must be handled in this case: once
tuning is done and we want to use PHY-optimized paths, we must fall back
to more basic/slower reads if, for some external reason, they start
failing, right?
How would you even detect that your tuning is out-of-date because of
temperature changes? You would need some sort of on-flash ECC to detect
that. I think many of the flashes that support DDR reads at high
frequencies also have ECC, but AFAIK the SPI NOR core does not support
it.
Anyway, I think we should limit the scope of the problem. Let's first
start with the expectation that the tuning supports the whole operation
range of the device. This was true at least for the spi-cadence-quadspi
tuning that I worked on when I was at TI. The tuning parameters had
enough margin to ensure it worked for the device's whole temperature
range.
If there is a tuning algorithm that can't do that, then we can extend
the core to either do ECC or perhaps let temperature sensors signal the
need for re-calibration.
But for now I think it is easiest to just ignore the problem and focus
on the other ones like how to get the calibration pattern and how to do
the tuning.
The obvious choice in this case would be to leave this error handling to
the controller driver. Re-using the same operation at a lower speed
would be suboptimal, because the fastest operation at a high speed might
not be the most efficient at slower speeds due to the number of dummy
cycles needed. But I believe this is negligible based on the fact that
we already are in degraded mode at that stage.
However, this may conflict with:
- read retries
- continuous reads (?)
So in practice the fallback might be needed on the SPI NAND/NOR side
(this can be further discussed).
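To put rough numbers on the dummy-cycle point above, here is a minimal
back-of-the-envelope model. Everything in it is an assumption made for
illustration: the 166 MHz and 50 MHz clocks, the 20 and 8 dummy cycles,
the 3 command/address clocks and the 2 bytes per clock (octal DDR) are
hypothetical values, not taken from any datasheet or from the patch
series.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadOp:
    """Hypothetical read operation; all values are illustrative only."""
    name: str
    freq_mhz: float
    dummy_cycles: int
    overhead_cycles: int = 3   # assumed command + address clocks
    bytes_per_cycle: int = 2   # assumed octal DDR: 8 lines x 2 edges / 8 bits


def read_time_us(op: ReadOp, nbytes: int) -> float:
    """Time for one read of nbytes, in microseconds (cycles / MHz)."""
    data_cycles = -(-nbytes // op.bytes_per_cycle)  # ceiling division
    total_cycles = op.overhead_cycles + op.dummy_cycles + data_cycles
    return total_cycles / op.freq_mhz


# The tuned high-speed op, the same op reused at the slower non-PHY clock,
# and a variant that needs fewer dummy cycles at that slower clock.
tuned_fast_op = ReadOp("fast op, PHY tuned", freq_mhz=166, dummy_cycles=20)
reused_fast_op = ReadOp("fast op reused at low speed", freq_mhz=50, dummy_cycles=20)
low_dummy_op = ReadOp("low-dummy op at low speed", freq_mhz=50, dummy_cycles=8)

if __name__ == "__main__":
    for nbytes in (16, 4096):
        for op in (tuned_fast_op, reused_fast_op, low_dummy_op):
            print(f"{nbytes:5d} B  {op.name:30s} {read_time_us(op, nbytes):8.3f} us")
```

Under these assumed values the extra dummy cycles cost a visible share
of a 16-byte read but well under 1% of a 4096-byte read, which is
consistent with calling the effect negligible for bulk transfers.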
Just to summarize, fallback logic during probe:
- If the controller reports a tuning failure, the spi-mem client may
either retry tuning with the next-best (max-1) operation or fall back to
the non-PHY, slower operation and adjust the dummy cycles accordingly to
use the optimal non-PHY variant.
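The probe-time fallback summarized above could be modelled roughly as
follows. This is only a sketch of the decision flow, not the spi-mem
API: the Op fields, the tune() callback, the candidate list and the
max_tuning_attempts knob are all hypothetical names and values
introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Op:
    """Hypothetical candidate read operation (illustrative fields only)."""
    name: str
    freq_mhz: int
    dummy_cycles: int
    needs_phy: bool


def select_read_op(candidates: Sequence[Op],
                   tune: Callable[[Op], bool],
                   max_tuning_attempts: int = 2) -> Op:
    """Pick the op to use after probe.

    candidates must be sorted best-throughput first. tune() stands in for
    whatever the controller driver exposes and returns True on success.
    """
    phy_ops = [op for op in candidates if op.needs_phy]
    # Try the best PHY op, then the next-best (max-1), up to the limit.
    for op in phy_ops[:max_tuning_attempts]:
        if tune(op):
            return op
    # Tuning failed: use the best non-PHY op, which carries its own
    # (lower) dummy-cycle count instead of reusing the fast op's value.
    non_phy_ops = [op for op in candidates if not op.needs_phy]
    if not non_phy_ops:
        raise RuntimeError("no non-PHY operation available to fall back to")
    return non_phy_ops[0]


# Hypothetical candidate table, best first.
CANDIDATES = [
    Op("best PHY op", freq_mhz=166, dummy_cycles=20, needs_phy=True),
    Op("next-best PHY op", freq_mhz=133, dummy_cycles=16, needs_phy=True),
    Op("non-PHY op", freq_mhz=50, dummy_cycles=8, needs_phy=False),
]
```

Setting max_tuning_attempts to 1 in this model gives the other variant
mentioned above, where a single tuning failure drops straight to the
non-PHY operation so the probe still succeeds.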
And yes, for now the priority is to have a robust probe-time tuning flow
before addressing any runtime tuning concerns.
But once we solve this, comes a similar problem on the write side. How
do we know if a write will or did fail because of a temperature change?
What may be the heuristics to fall back in this case?
Santhosh, do you have any numbers on write performance improvements? I
am curious if it is even worth the effort.
There's no real performance gain for SPI NOR, but SPI NAND shows notable
improvement wrt. page size.
Write performance numbers from AM62A SK with W35N01JW OSPI NAND:
- without PHY: 6 MB/s
- with PHY: 9.2 MB/s
Thanks,
Santhosh.
Thanks,
Miquèl