Re: [PATCH] ata: libata-scsi: fix requeue of deferred ATA PASS-THROUGH commands

From: Niklas Cassel

Date: Sun Apr 12 2026 - 06:42:58 EST


On Fri, Apr 10, 2026 at 04:15:19PM -0700, Igor Pylypiv wrote:
> Commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
> introduced ata_scsi_requeue_deferred_qc() to handle commands deferred
> during resets or NCQ failures. This deferral logic completed commands
> with DID_SOFT_ERROR to trigger a retry in the SCSI mid-layer.
>
> However, DID_SOFT_ERROR is subject to scsi_cmd_retry_allowed() checks.
> ATA PASS-THROUGH commands sent via SG_IO ioctl have scmd->allowed set
> to zero. This causes the mid-layer to fail the command immediately
> instead of retrying, even though the command was never actually issued
> to the hardware.
>
> Switch to DID_REQUEUE to ensure these commands are inserted back into
> the request queue regardless of retry limits.
>
> Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
> Signed-off-by: Igor Pylypiv <ipylypiv@xxxxxxxxxx>
> ---
> drivers/ata/libata-scsi.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 3b65df914ebb..0236394900cc 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -1692,7 +1692,7 @@ void ata_scsi_requeue_deferred_qc(struct ata_port *ap)
> /*
> * If we have a deferred qc when a reset occurs or NCQ commands fail,
> * do not try to be smart about what to do with this deferred command
> - * and simply retry it by completing it with DID_SOFT_ERROR.
> + * and simply requeue it by completing it with DID_REQUEUE.
> */
> if (!qc)
> return;
> @@ -1701,7 +1701,7 @@ void ata_scsi_requeue_deferred_qc(struct ata_port *ap)
> ap->deferred_qc = NULL;
> cancel_work(&ap->deferred_qc_work);
> ata_qc_free(qc);
> - scmd->result = (DID_SOFT_ERROR << 16);
> + set_host_byte(scmd, DID_REQUEUE);

set_host_byte() will set the host byte, but it will keep the status byte
and the ML byte intact.

Since the existing code uses the assignment operator, I assumed that Damien
intentionally wanted to clear the status byte and the ML byte.
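
To illustrate the difference, here is a rough userspace sketch (not kernel
code). struct model_cmnd and the model_*() helpers are made up for this
example, and the mask reflects my understanding of what set_host_byte() does,
so please double check against include/scsi/scsi_cmnd.h:

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the kernel's SCSI headers. */
#define DID_SOFT_ERROR            0x0b
#define DID_REQUEUE               0x0d
#define SAM_STAT_CHECK_CONDITION  0x02

/* Hypothetical stand-in for the result field of struct scsi_cmnd. */
struct model_cmnd {
	uint32_t result;
};

/* Models set_host_byte(): replace bits 16-23, keep all other bytes. */
static void model_set_host_byte(struct model_cmnd *cmd, uint8_t host)
{
	cmd->result = (cmd->result & 0xff00ffffu) | ((uint32_t)host << 16);
}

/* Models the plain assignment: all bytes other than the host byte are cleared. */
static void model_assign_host_byte(struct model_cmnd *cmd, uint8_t host)
{
	cmd->result = (uint32_t)host << 16;
}

/* Start from a stale status byte and compare the two approaches. */
static void model_self_check(void)
{
	struct model_cmnd a = { .result = SAM_STAT_CHECK_CONDITION };
	struct model_cmnd b = a;

	model_set_host_byte(&a, DID_REQUEUE);
	model_assign_host_byte(&b, DID_REQUEUE);

	assert(a.result == 0x000d0002u); /* status byte survives */
	assert(b.result == 0x000d0000u); /* status byte is cleared */
}
```

If scmd->result can never hold a non-zero status byte or ML byte at this
point, the two variants are equivalent, but that is exactly the kind of
argument that belongs in a commit message.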

My point is that using set_host_byte() is a functional change.
If we want to stop clearing the status byte and the ML byte, then I think
that change should be in a separate commit, with a proper motivation in the
commit message.

However, for the fix patch itself, I think we should just do the following,
if that is sufficient to fix your observed problem:
- scmd->result = (DID_SOFT_ERROR << 16);
+ scmd->result = (DID_REQUEUE << 16);

I would also be happy to see a follow-up patch that switches to
set_host_byte(), if its commit message explains why that change is
safe/valid.


Kind regards,
Niklas