Re: [PATCH 4/8] io_uring/io-wq: cache work->flags in variable

From: Pavel Begunkov
Date: Fri Jan 31 2025 - 09:06:09 EST


On 1/30/25 14:57, Jens Axboe wrote:
On 1/29/25 10:36 PM, Max Kellermann wrote:
On Thu, Jan 30, 2025 at 12:41 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
Ok, then it's an architectural problem and needs more serious
reengineering, e.g. of how work items are stored and grabbed

Rough unpolished idea: I was thinking about having multiple work
lists, each with its own spinlock (separate cache line), and each
io-wq thread only uses one of them, while the submitter round-robins
through the lists.
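
To make the idea above concrete, here is a rough userspace sketch, not io-wq code. Every name in it (sharded_wq, swq_enqueue, swq_dequeue, NR_LISTS) is hypothetical, pthread mutexes stand in for the kernel spinlocks, and the 64-byte alignment is an assumed cache line size.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_LISTS 4 /* hypothetical: one list per worker thread */

struct work {
	struct work *next;
	int id;
};

/* One list per worker, aligned so each lock sits on its own cache line
 * (64 bytes is an assumption about the target CPU). */
struct work_list {
	pthread_mutex_t lock; /* stands in for a spinlock */
	struct work *head, *tail;
} __attribute__((aligned(64)));

struct sharded_wq {
	struct work_list lists[NR_LISTS];
	atomic_uint next; /* round-robin cursor shared by submitters */
};

static void swq_init(struct sharded_wq *wq)
{
	for (int i = 0; i < NR_LISTS; i++) {
		pthread_mutex_init(&wq->lists[i].lock, NULL);
		wq->lists[i].head = wq->lists[i].tail = NULL;
	}
	atomic_init(&wq->next, 0);
}

/* Submitter side: append to the next list in round-robin order.
 * Returns the index of the list that received the item. */
static unsigned swq_enqueue(struct sharded_wq *wq, struct work *w)
{
	unsigned idx = atomic_fetch_add(&wq->next, 1) % NR_LISTS;
	struct work_list *l = &wq->lists[idx];

	assert(w != NULL);
	w->next = NULL;
	pthread_mutex_lock(&l->lock);
	if (l->tail)
		l->tail->next = w;
	else
		l->head = w;
	l->tail = w;
	pthread_mutex_unlock(&l->lock);
	return idx;
}

/* Worker side: a worker only ever pops from its own list.
 * Returns NULL if that list is empty. */
static struct work *swq_dequeue(struct sharded_wq *wq, unsigned worker)
{
	struct work_list *l = &wq->lists[worker % NR_LISTS];
	struct work *w;

	pthread_mutex_lock(&l->lock);
	w = l->head;
	if (w) {
		l->head = w->next;
		if (!l->head)
			l->tail = NULL;
	}
	pthread_mutex_unlock(&l->lock);
	return w;
}
```

Submitters and workers contend only on the lock of one list at a time. The cost is that an item is bound to a list at queue time, so it waits if that list's worker is busy or blocked even while other workers sit idle.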

Pending work would certainly need better spreading than just the two
classes we have now.

One thing to keep in mind is that the design of io-wq is such that it's
quite possible to have N work items pending and just a single thread
serving all of them. If the io-wq thread doesn't go to sleep, it will
keep processing work units. This is done for efficiency reasons, and to

Looking at people complaining about too many iowq tasks, we should be
limiting the number of them even more aggressively, and maybe scaling
them down faster if that's a problem.

avoid a proliferation of io-wq threads when it's not going to be
beneficial. This means that when you queue a work item, it's not easy to
pick an appropriate io-wq thread upfront, and generally the io-wq thread
itself will pick its next work item at the perfect time - when it
doesn't have anything else to do, or finished the existing work.

This should be kept in mind for making io-wq scale better.

People are saying that work stealing is working well with thread
pools, that might be an option, even though there are some
differences from userspace thread pools. I also remember Hao was
trying to do something for iowq a couple of years ago.
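
For reference, a minimal userspace sketch of what work stealing could look like. It is not io-wq code and not a proposal for the actual data structures: all names (ws_pool, ws_push, ws_next, NR_WORKERS, QUEUE_CAP) are made up, and it uses a mutex-protected ring buffer per worker instead of a lock-free deque.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_WORKERS 4  /* hypothetical pool size */
#define QUEUE_CAP  64 /* hypothetical per-worker capacity */

struct ws_queue {
	pthread_mutex_t lock;
	int items[QUEUE_CAP]; /* ring buffer of work item ids */
	size_t head;          /* index of the oldest item */
	size_t count;
};

struct ws_pool {
	struct ws_queue q[NR_WORKERS];
};

static void ws_init(struct ws_pool *p)
{
	for (int i = 0; i < NR_WORKERS; i++) {
		pthread_mutex_init(&p->q[i].lock, NULL);
		p->q[i].head = 0;
		p->q[i].count = 0;
	}
}

/* Queue an item on one worker's queue. Returns 1 on success, 0 if full. */
static int ws_push(struct ws_pool *p, unsigned worker, int item)
{
	struct ws_queue *q = &p->q[worker % NR_WORKERS];
	int ok = 0;

	pthread_mutex_lock(&q->lock);
	if (q->count < QUEUE_CAP) {
		q->items[(q->head + q->count) % QUEUE_CAP] = item;
		q->count++;
		ok = 1;
	}
	pthread_mutex_unlock(&q->lock);
	return ok;
}

/* Remove the oldest item from one queue. Returns 1 on success, 0 if empty. */
static int ws_pop_front(struct ws_queue *q, int *item)
{
	int ok = 0;

	pthread_mutex_lock(&q->lock);
	if (q->count) {
		*item = q->items[q->head];
		q->head = (q->head + 1) % QUEUE_CAP;
		q->count--;
		ok = 1;
	}
	pthread_mutex_unlock(&q->lock);
	return ok;
}

/* Worker side: take from the own queue first; if it is empty, scan the
 * other queues in order and steal the oldest item found. */
static int ws_next(struct ws_pool *p, unsigned worker, int *item)
{
	assert(item != NULL);
	worker %= NR_WORKERS;
	if (ws_pop_front(&p->q[worker], item))
		return 1;
	for (unsigned i = 1; i < NR_WORKERS; i++) {
		if (ws_pop_front(&p->q[(worker + i) % NR_WORKERS], item))
			return 1;
	}
	return 0;
}
```

The point of the sketch is that the worker still picks its next item only when it has nothing else to do, while the common case touches just its own lock. A worker that finds its queue empty pays for the scan over the other queues.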

--
Pavel Begunkov