Re: [BUG] RCU hang with io_uring nvme polling

From: Jens Axboe

Date: Fri Jun 26 2026 - 12:40:15 EST


On 6/26/26 10:33 AM, Keith Busch wrote:
> On Fri, Jun 26, 2026 at 10:06:49AM -0600, Jens Axboe wrote:
>> Ah good catch, I missed that. Should've grepped! In general, IO should
>> either get polled, or if the device is misbehaving, then timeouts will
>> catch it. That said, haven't looked at the actual report yet, will do
>> so next week (unless you beat me to it...?)
>
> I'll give it a shot!
>
> The test has 1 polling queue with 2 jobs dispatching. One of the jobs
> polled the completions for both. The other job is polling for no reason
> at all with nothing outstanding. The only thing that can break us out of
> that loop now is need_resched(), but that appears to never return true.

Yes, it's a bad configuration. I bet it's as simple as:

https://lore.kernel.org/linux-block/20260617155051.1266079-1-anuj20.g@xxxxxxxxxxx/

but in practice nobody should configure a single poll queue and run
multiple jobs on it, particularly not when the objective is framed
around "energy efficiency". This configuration is pretty much guaranteed
to waste 2 cores, with most of the time going towards spinning on a lock
rather than doing potentially useful work.

--
Jens Axboe