Re: [PATCH] iommu: Avoid softlockup and rcu stall in fq_flush_timeout().

From: John Garry
Date: Mon May 22 2023 - 11:36:34 EST


On 22/05/2023 16:18, Jerry Snitselaar wrote:
>> My guess is that the allocations are too big and not covered by the
>> allocation sizes supported by the flush-queue code. But maybe this is
>> something that can be fixed. Or the flush-queue code could even be
>> changed to auto-adapt to allocation patterns of the device driver?
>>
>> Regards,
>>
>> Joerg
>
> In the case I know of it involved some proprietary test suites
> (Hazard I/O, and Medusa?), and the lpfc driver. I was able to force
> the condition using fio with a number of jobs running. I'll play
> around and see if I can figure out a point where it starts to become
> an issue.
>
> I mentioned what the nvme driver did to the Broadcom folks for the max
> dma size, but I haven't had a chance to go looking at it myself yet to
> see if there is somewhere in the lpfc code to fix up.

JFYI, SCSI core already supports setting this via shost->opt_sectors;
see the example in sas_host_setup().
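
Roughly, the idea is to cap the host's optimal transfer size at whatever
the DMA layer reports as its optimal mapping size. Below is a standalone
userspace sketch of that arithmetic, not the kernel code itself:
cap_opt_sectors() is a hypothetical helper, and its dma_opt_bytes
argument stands in for the value the kernel's dma_opt_mapping_size()
would return for the device.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, same value the kernel uses */

/*
 * Hypothetical helper (not a kernel function): return the optimal
 * transfer size in sectors, limited by both the host's max_sectors
 * and the DMA layer's optimal mapping size in bytes.
 */
static unsigned int cap_opt_sectors(unsigned int max_sectors,
				    size_t dma_opt_bytes)
{
	/* Convert the byte limit to 512-byte sectors. */
	size_t opt = dma_opt_bytes >> SECTOR_SHIFT;

	/* Never advertise more than the host can transfer anyway. */
	return opt < max_sectors ? (unsigned int)opt : max_sectors;
}
```

For example, with max_sectors = 1024 and an illustrative optimal mapping
size of 128 KiB, this gives 256 sectors. If the DMA layer reports no
practical limit (SIZE_MAX here), the result is just max_sectors.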

This issue may continue to pop up so we may need a better way to turn it on/off for all drivers or classes of drivers.

John