Re: [PATCH] workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND.
From: Tejun Heo
Date: Wed Feb 26 2025 - 11:44:23 EST

Hello, Frederic.

On Wed, Feb 26, 2025 at 04:02:19PM +0100, Frederic Weisbecker wrote:
...
> > That's API guarantee and there are plenty of users who depend on
> > queue_work() and schedule_work() on per-cpu workqueues to be actually
> > per-cpu. I don't think we can pull the rug from under them. If we want to do
> > this, which I think is a good idea, we should:
> >
> > 1. Convert per-cpu workqueue users to unbound workqueues. Most users don't
> > care whether work item is executed locally or not. However, historically,
> > we've been preferring per-cpu workqueues because unbound workqueues had a
> > lot worse locality properties. Unbound workqueue's topology awareness is
> > a lot better now, so this should be less of a problem and we should be
> > able to move a lot of users over to unbound workqueues.
>
> But we must check those ~1951 schedule_work() users one by one to make sure they
> don't rely on locality for correctness, right? :-)

Yes, no matter what we do, there is no way around that.

> > 2. There still are cases where local execution isn't required for
> > correctness but local & concurrency controlled executions yield
> > performance gains. Workqueue API currently doesn't distinguish these two
> > cases. We should add a new API which prefers local execution but doesn't
> > require it, which can then do what's suggested in this patch.
>
> That is much trickier to find out and requires to know about the subsystem
> details and history.

One good thing is that for workqueues that actually should be per-CPU for
performance, there usually is a group of people, often including the
maintainers, who would be familiar with the performance situation and pipe
up, so it's not *that* hopeless.

> For those that don't rely on locality for correctness, we would really like
> to be able to offload them to unbound pool at least when nohz_full= is filled.
> Because in that case we don't care much on workqueues performance.

Yeah, that makes sense to me.

Thanks.

--
tejun