Re: [bug report] block: Non-NCQ commands will never be executed while fio is continuously running

From: Niklas Cassel
Date: Thu Oct 31 2024 - 10:13:04 EST


On Thu, Sep 19, 2024 at 04:14:15PM +0200, Damien Le Moal wrote:
> On 2024/09/19 14:26, Yu Kuai wrote:
> >
> > Does libata return a specific value in this case? If so, maybe we can
> > stop other hctx until this IO is handled.
> >
> > For now, I think libata should use single hctx, it just doesn't support
> > multiple hctx yet.
>
> libata does not care/know about hctx. It only issues commands to ATA devices,
> which are always single queue. And pure SATA adapters like AHCI are always
> single queue.
>
> The issue at hand can happen only for libsas based SAS HBAs that have multiple
> command submission queues (with a shared tag set). Commands for the same device
> may end up being submitted through different queues, and when the submitted
> commands include a mix of NCQ and non-NCQ commands, the problem happens without
> libata being able to easily do anything about it. No control is possible at
> the SCSI layer either, since the commands submitted are SCSI (not yet
> translated to ATA commands) and do not have any NCQ/non-NCQ exclusion
> knowledge at all. NCQ is an ATA concept unknown to the SCSI and block layers.
>
> We (Niklas and I) are trying to find a solution, but that may not be within
> libata itself. It may need changes to libsas as well. Not sure yet. Still exploring.

Hello Xingui,

I sent a proposed solution to this problem here:
https://lore.kernel.org/linux-ide/20241031140731.224589-4-cassel@xxxxxxxxxx/

Please test and see if it addresses your problem.


Kind regards,
Niklas