Re: [PATCH v5 0/3] spi: tegra210-quad: Improve timeout handling under high system load

From: Jon Hunter

Date: Thu Nov 06 2025 - 05:06:41 EST



On 28/10/2025 15:57, Vishwaroop A wrote:
Hi,

This patch series addresses timeout handling issues in the Tegra QSPI driver
that occur under high system load conditions. We've observed that when CPUs
are saturated (due to error injection, RAS firmware activity, or general CPU
contention), QSPI interrupt handlers can be delayed, causing spurious transfer
failures even though the hardware completed the operation successfully.

Patch 1 fixes a stale pointer issue by ensuring curr_xfer is cleared on timeout
and checked when the IRQ thread finally runs. It also ensures interrupts are
properly cleared on failure paths.

Patch 2 refactors the timeout cleanup code into dedicated helper functions
(tegra_qspi_reset, tegra_qspi_dma_stop, tegra_qspi_pio_stop) to improve code
readability and maintainability. This is purely a code reorganization with no
functional changes.

Patch 3 adds hardware status checking on timeout. Before failing a transfer,
the driver now reads QSPI_TRANS_STATUS to verify if the hardware actually
completed the operation. If so, it manually invokes the completion handler
instead of failing the transfer. This distinguishes genuine hardware timeouts
from delayed/lost interrupts.

These changes have been tested in production environments under various high
load scenarios including RAS testing and CPU saturation workloads.


For the series ...

Tested-by: Jon Hunter <jonathanh@xxxxxxxxxx>
Reviewed-by: Jon Hunter <jonathanh@xxxxxxxxxx>

Thanks
Jon

--
nvpublic