Re: [BUG] RCU hang with io_uring nvme polling

From: Keith Busch

Date: Mon Jun 29 2026 - 18:04:34 EST


On Mon, Jun 29, 2026 at 03:40:52PM -0600, Keith Busch wrote:
> On Mon, Jun 29, 2026 at 04:47:00PM -0400, Ben Carey wrote:
>
> > Putting
> > io_check_iopoll behind a spinlock seems to fix it, though I imagine a more
> > elegant fix is out there (reusing a different lock, not using expensive locks,
> > a smarter place to check for racing, etc.)
>
> I can see why that resolves your observation, but I don't think we can
> do this. We're ultimately polling for a hardware event, and this layer
> is too high a level for serializing these things.

It's also worse than that; your proposal serializes only within a single
io_uring_ctx, so two completely different applications, each with its
own ctx, could still hit the exact same problem you discovered.

I don't necessarily like the accepted solution, as it is time-bound on
jiffies for an idle device, which is an eternity for low-latency
storage, but what else can we do? It's too expensive to check for a
specific IO or for an idle device on each polling iteration. I guess
we're expecting that a hi-pri application is constantly feeding the
queue, such that this is a non-issue.
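
For illustration only, a minimal userspace sketch of what a time-bounded
poll loop looks like. This is not the kernel change itself: fake_cq,
fake_cq_poll(), now_ns() and poll_time_bounded() are hypothetical names,
and the bound here is a monotonic-clock budget in nanoseconds standing in
for a bound expressed in jiffies. The point is that the loop never asks
"is my specific IO done?" or "is the device idle?"; it only reaps whatever
is there and gives up once the budget is spent.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() under strict -std= */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a hardware completion queue. */
struct fake_cq {
	unsigned int pending;	/* completions waiting to be reaped */
	unsigned long polls;	/* how many times the queue was polled */
};

/* Reap everything currently pending; returns the number reaped. */
static int fake_cq_poll(struct fake_cq *cq)
{
	int found;

	assert(cq != NULL);
	cq->polls++;
	found = (int)cq->pending;
	cq->pending = 0;
	return found;
}

/* Monotonic time in nanoseconds (userspace stand-in for jiffies). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin on the queue until something completes or the time budget runs
 * out. Returns the number of completions reaped, or 0 on timeout.
 * An idle queue therefore costs the caller the full budget.
 */
static int poll_time_bounded(struct fake_cq *cq, uint64_t budget_ns)
{
	uint64_t deadline = now_ns() + budget_ns;

	for (;;) {
		int found = fake_cq_poll(cq);

		if (found)
			return found;
		if (now_ns() >= deadline)
			return 0;
	}
}
```

With a queue that already has completions pending, poll_time_bounded()
returns after a single poll. With an idle queue it spins for the whole
budget before returning 0, which is the latency cost described above.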