Re: [RFC PATCH 01/10] spi: spi-mem: Introduce support for tuning controller

From: Santhosh Kumar K

Date: Wed Dec 10 2025 - 06:35:06 EST

Hello Amit,

On 04/12/25 22:24, Mahapatra, Amit Kumar wrote:

[AMD Official Use Only - AMD Internal Distribution Only]

Hello Santosh,

-----Original Message-----
From: Santhosh Kumar K <s-k6@xxxxxx>
Sent: Wednesday, December 3, 2025 1:32 PM
To: Pratyush Yadav <pratyush@xxxxxxxxxx>
Cc: Miquel Raynal <miquel.raynal@xxxxxxxxxxx>; richard@xxxxxx; vigneshr@xxxxxx;
broonie@xxxxxxxxxx; tudor.ambarus@xxxxxxxxxx; mwalle@xxxxxxxxxx; p-
mantena@xxxxxx; linux-spi@xxxxxxxxxxxxxxx; linux-mtd@xxxxxxxxxxxxxxxxxxx; linux-
kernel@xxxxxxxxxxxxxxx; a-dutta@xxxxxx; u-kumar1@xxxxxx; praneeth@xxxxxx; s-
k6@xxxxxx
Subject: Re: [RFC PATCH 01/10] spi: spi-mem: Introduce support for tuning
controller

Hello Pratyush,

On 18/11/25 19:19, Pratyush Yadav wrote:

On Sat, Sep 20 2025, Santhosh Kumar K wrote:

[...]

This is actually wrong. Tuning is way more generic than that :) If
someone wants to use a chip at a high frequency (50MHz in your case,
but whatever, there is a threshold above which additional care must
be taken), it must go through the calibration step. It does not
matter in which mode you are. Calibration would still be relevant in
single SDR mode.
This 50MHz bothered Mark because it is too Cadence specific. Maybe
this should be a controller parameter? If the spi-mem core (or even
the spi core, by extensino) sees that the design allows running at
XMHz (due to the SPI peripheral properties or simply the absence of
any limitation), and if the controller states that it requires an
extra tuning step above YMHz (and X > Y), then it launches the calibration.
From a core perspective, I would like the calibration hook to be
as simple as possible, because what "calibration" means is highly
controller and chip specific.

I understand the concern here.

Let me point out the options for launching the tuning procedure,
along with the issues in each approach.

Option 1: Launch tuning as part of spi_mem_exec_op()
- After spi_mem_access_start(), introduce a
spi_mem_needs_tuning() check (a new callback to SPI MEM controller)
to check whether the current op requires tuning
- If yes, we call spi_mem_execute_tuning()
- on success, mark tuning complete in a flag within SPI MEM
Controller private data
- on failure, we attempt a fallback by calling
spi_mem_adjust_op_freq() and drop to a lower supported frequency

Option 2: Launch tuning within spi_controller->exec_op() implementation
- Very similar to option 1, except that the
spi_mem_execute_tuning() is triggered from within the controller's
exec_op() implementation (no need for spi_mem_needs_tuning())

Drawbacks in option 1 and 2:
- Tuning requires multiple reads of a known pattern, but the
flash may not always be in a state to allow read commands
- No fallback on failures, can't make flash-specific adjustments
in case of a tuning failure
- No access to write_op() to write known pattern temporarily to
an on-die cache. Pattern needs to be always burnt into the flash

- Plus, in option 2 - we can't call spi_mem_adjust_op_freq()

While the need for tuning is dictated by Controller specific
characteristics the ops (and state of the chip) required to complete
tuning is under the control of spi-mem users (spi-nand/spi-nor).
So, it's impossible to achieve tuning without the help of spi-mem users.

So, Option 3: Launch from SPI MEM clients
(mtd/nand/spi or mtd/spi-nor, etc.,)
- Once the spi-mem chip is completely enumerated and best read
and write ops are chosen call spi_mem_needs_tuning(read_op, write_op)
as a part of .probe()
- If tuning is required, call
spi_mem_execute_tuning(read_op, write_op)
- If only read_op is provided, it implies the tuning pattern
is pre-flashed to the partition
- On tuning failure, retry by re-running spi_mem_needs_tuning()
with the second best set of ops (max throughput - 1)

With option 3, spi_mem users are limited to calling
spi_mem_needs_tuning() and spi_mem_execute_tuning(). Rest is hidden
within the controller drivers. If spi-mem users change read/write
ops, the above sequence can be re-issued.

The controller can store the read_op and write_op in case of a tuning
success and periodically re-run tuning, ensuring we always have valid
tuning parameters.

One concern with option 3 is that we may not be able to make use of
static data on certain flash as tuning patterns (like reading
parameter page or SFDP table for tuning instead of controller
specific attack patterns).

Why not? How else would tuning work? Do you expect controllers to
first flash the tuning pattern and then tune the reads? That is a hard
no I think, since you don't want to over-write user data and I don't
think we will ever have any area of memory we can reliably over-write
without risking that.

I think we should start with the requirement to have the pattern
flashed already and figure out how SPI NOR or SPI NAND can discover
that (perhaps via NVMEM?).

I agree - having the controller overwrite user data is a hard no!

For SPI NAND, a program operation happens in two steps: data is first copied into
the internal cache, and only then written to the flash during the program-execute
phase. This is why the tuning flow writes the pattern only to the device's internal
cache and reads it back it from there. This avoids touching any user data on the
flash and is already implemented in v2 (which I'll post shortly).

For SPI NOR, we do not have an equivalent "write-to-cache" possible, so we still
require a pre-flashed pattern region. At the moment this is provided via a dedicated
"phypattern" partition, and its offset is obtained through the of_get_* APIs.

I was wondering, for SPI-NOR devices, why can’t we use the "phypattern"
partition to write the pattern and then read it back during tuning? Since the user
would need to define a specific partition name (i.e., phypattern) to initiate tuning,
that partition could also be reserved for tuning purposes. Please let me know
your thoughts on this.

Yes, the 'phypattern' partition is used as the source for the tuning
pattern on SPI-NOR devices, but the pattern itself is pre-flashed. We
avoid having the SPI-NOR core write the data, since the driver should
neither need to know the pattern contents nor modify any persistent
flash region during tuning. Keeping the pattern pre-flashed ensures the
tuning sequence is read-only from the driver's perspective and prevents
any risk of touching user or system data.

Regards,
Santhosh.

Regards,
Amit

Regarding ways to locate the partition:

1. Using NVMEM:
a. Exposing the phypattern partition as an NVMEM cell and issuing an
NVMEM read during tuning does not work reliably, because NVMEM
ends up calling into the MTD read path and we cannot control which
read_op variant is used for the read.

b. Advertising the partition as an NVMEM cell and using NVMEM only
to fetch the offset is not possible either. NVMEM abstracts the
private data, including partition offsets, so we can't retrieve
the offset as well.

2. Using of_get_* APIs:
Using the standard OF helpers to locate the phypattern partition
and retrieve its offset is both reliable and straighforward, and
is the approach currently implemented in v2.

I think SFDP is quite nice for this, but IIRC for
spi-candence-quadspi, that was not a viable option due to some
reasons. If you can now make it work with SFDP, then that would be
even better, since we don't have to deal with the pain of pre-flashing.

The current tuning flow requires a specific stress pattern to ensure robustness, and
the SFDP data aren't good enough for it.

Overall, I think option 3 is the most promising. Options 1 and 2 will
likely add so much overhead they will end up being slower than non-PHY
reads, since tuning is usually quite expensive.

Thanks,
Santhosh.

Please let me know your thoughts on which of these directions makes
the most sense.

Thanks,
Santhosh.