Re: [RFC PATCH 01/10] spi: spi-mem: Introduce support for tuning controller

From: Santhosh Kumar K

Date: Wed Dec 03 2025 - 03:02:40 EST

Hello Pratyush,

On 18/11/25 19:19, Pratyush Yadav wrote:

On Sat, Sep 20 2025, Santhosh Kumar K wrote:

[...]

This is actually wrong. Tuning is way more generic than that :)
If someone wants to use a chip at a high frequency (50MHz in your case,
but whatever, there is a threshold above which additional care must be
taken), it must go through the calibration step. It does not matter in
which mode you are. Calibration would still be relevant in single SDR
mode.
This 50MHz bothered Mark because it is too Cadence specific. Maybe this
should be a controller parameter? If the spi-mem core (or even the spi
core, by extensino) sees that the design allows running at XMHz (due to
the SPI peripheral properties or simply the absence of any limitation),
and if the controller states that it requires an extra tuning step above
YMHz (and X > Y), then it launches the calibration.
From a core perspective, I would like the calibration hook to be as
simple as possible, because what "calibration" means is highly
controller and chip specific.

I understand the concern here.

Let me point out the options for launching the tuning procedure, along
with the issues in each approach.

Option 1: Launch tuning as part of spi_mem_exec_op()
- After spi_mem_access_start(), introduce a spi_mem_needs_tuning()
check (a new callback to SPI MEM controller) to check whether the
current op requires tuning
- If yes, we call spi_mem_execute_tuning()
- on success, mark tuning complete in a flag within SPI MEM
Controller private data
- on failure, we attempt a fallback by calling
spi_mem_adjust_op_freq() and drop to a lower supported frequency

Option 2: Launch tuning within spi_controller->exec_op() implementation
- Very similar to option 1, except that the spi_mem_execute_tuning()
is triggered from within the controller's exec_op() implementation
(no need for spi_mem_needs_tuning())

Drawbacks in option 1 and 2:
- Tuning requires multiple reads of a known pattern, but the flash
may not always be in a state to allow read commands
- No fallback on failures, can't make flash-specific adjustments in
case of a tuning failure
- No access to write_op() to write known pattern temporarily to an
on-die cache. Pattern needs to be always burnt into the flash

- Plus, in option 2 - we can't call spi_mem_adjust_op_freq()

While the need for tuning is dictated by Controller specific
characteristics the ops (and state of the chip) required to complete
tuning is under the control of spi-mem users (spi-nand/spi-nor).
So, it's impossible to achieve tuning without the help of spi-mem users.

So, Option 3: Launch from SPI MEM clients
(mtd/nand/spi or mtd/spi-nor, etc.,)
- Once the spi-mem chip is completely enumerated and best read and
write ops are chosen call spi_mem_needs_tuning(read_op, write_op) as
a part of .probe()
- If tuning is required, call
spi_mem_execute_tuning(read_op, write_op)
- If only read_op is provided, it implies the tuning pattern is
pre-flashed to the partition
- On tuning failure, retry by re-running spi_mem_needs_tuning() with
the second best set of ops (max throughput - 1)

With option 3, spi_mem users are limited to calling
spi_mem_needs_tuning() and spi_mem_execute_tuning(). Rest is hidden
within the controller drivers. If spi-mem users change read/write ops,
the above sequence can be re-issued.

The controller can store the read_op and write_op in case of a tuning
success and periodically re-run tuning, ensuring we always have valid
tuning parameters.

One concern with option 3 is that we may not be able to make use of
static data on certain flash as tuning patterns (like reading parameter
page or SFDP table for tuning instead of controller specific attack
patterns).

Why not? How else would tuning work? Do you expect controllers to first
flash the tuning pattern and then tune the reads? That is a hard no I
think, since you don't want to over-write user data and I don't think we
will ever have any area of memory we can reliably over-write without
risking that.

I think we should start with the requirement to have the pattern flashed
already and figure out how SPI NOR or SPI NAND can discover that
(perhaps via NVMEM?).

I agree - having the controller overwrite user data is a hard no!

For SPI NAND, a program operation happens in two steps: data is first
copied into the internal cache, and only then written to the flash
during the program-execute phase. This is why the tuning flow writes
the pattern only to the device's internal cache and reads it back it
from there. This avoids touching any user data on the flash and is
already implemented in v2 (which I'll post shortly).

For SPI NOR, we do not have an equivalent "write-to-cache" possible, so
we still require a pre-flashed pattern region. At the moment this is
provided via a dedicated "phypattern" partition, and its offset is
obtained through the of_get_* APIs.

Regarding ways to locate the partition:

1. Using NVMEM:
a. Exposing the phypattern partition as an NVMEM cell and issuing an
NVMEM read during tuning does not work reliably, because NVMEM
ends up calling into the MTD read path and we cannot control which
read_op variant is used for the read.

b. Advertising the partition as an NVMEM cell and using NVMEM only
to fetch the offset is not possible either. NVMEM abstracts the
private data, including partition offsets, so we can't retrieve
the offset as well.

2. Using of_get_* APIs:
Using the standard OF helpers to locate the phypattern partition
and retrieve its offset is both reliable and straighforward, and
is the approach currently implemented in v2.

I think SFDP is quite nice for this, but IIRC for spi-candence-quadspi,
that was not a viable option due to some reasons. If you can now make it
work with SFDP, then that would be even better, since we don't have to
deal with the pain of pre-flashing.

The current tuning flow requires a specific stress pattern to ensure
robustness, and the SFDP data aren't good enough for it.

Overall, I think option 3 is the most promising. Options 1 and 2 will
likely add so much overhead they will end up being slower than non-PHY
reads, since tuning is usually quite expensive.

Thanks,
Santhosh.

Please let me know your thoughts on which of these directions makes the
most sense.

Thanks,
Santhosh.