Re: [PATCH] io_uring: One wqe per wq

From: Jens Axboe
Date: Sat Mar 11 2023 - 17:14:03 EST


On 3/11/23 1:56 PM, Pavel Begunkov wrote:
> On 3/10/23 20:38, Jens Axboe wrote:
>> On 3/10/23 1:11 PM, Breno Leitao wrote:
>>> Right now io_wq allocates one io_wqe per NUMA node.  As io_wq is now
>>> bound to a task, the task basically uses only the NUMA local io_wqe, and
>>> almost never changes NUMA nodes, thus, the other wqes are mostly
>>> unused.
>>
>> What if the task gets migrated to a different node? Unless the task
>> is pinned to a node/cpumask that is local to that node, it will move
>> around freely.
>
> In which case we're screwed anyway and not only for the slow io-wq
> path but also with the hot path as rings and all io_uring ctx and
> requests won't be migrated locally.

Oh agree, not saying it's ideal, but it can happen.

What if you deliberately use io-wq to offload work and set it
to another mask? That one I suppose we could handle by allocating
based on the set mask. Two nodes might be more difficult...

For most things this won't really matter as io-wq is a slow path
for that, but there might very well be cases that deliberately
offload.
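To make the "allocating based on the set mask" idea concrete, here's a toy user-space sketch (not kernel code; the cpu-to-node table is invented for the example). It just maps an allowed-CPU mask to the set of NUMA nodes it covers, which is the information a wq would need to decide how many wqes to allocate:

```python
# Hypothetical illustration, not kernel code: derive the NUMA nodes
# implied by a task's allowed-CPU mask. The topology table is made up.

def nodes_for_mask(allowed_cpus, cpu_to_node):
    """Return the sorted set of NUMA nodes covered by an affinity mask."""
    return sorted({cpu_to_node[cpu] for cpu in allowed_cpus})

# Example topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1.
cpu_to_node = {cpu: cpu // 4 for cpu in range(8)}

print(nodes_for_mask({1, 2}, cpu_to_node))   # mask local to node 0 -> [0]
print(nodes_for_mask({2, 5}, cpu_to_node))   # spans both nodes -> [0, 1]
```

The second case is the awkward one mentioned above: a mask spanning two nodes gives no single obvious home for the wqe.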

> It's also curious whether io-wq workers will get migrated
> automatically as they are a part of the thread group.

They certainly will, unless affinitized otherwise.

--
Jens Axboe