Re: [PATCHSET wq/for-6.8] workqueue: Implement system-wide max_active for unbound workqueues

From: Lai Jiangshan
Date: Wed Dec 20 2023 - 04:20:40 EST


On Wed, Dec 20, 2023 at 3:25 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello,
>
> A pool_workqueue (pwq) represents the connection between a workqueue and a
> worker_pool. One of the roles that a pwq plays is enforcement of the
> max_active concurrency limit. Before 636b927eba5b ("workqueue: Make unbound
> workqueues to use per-cpu pool_workqueues"), there was one pwq per each CPU
> for per-cpu workqueues and per each NUMA node for unbound workqueues, which
> was a natural result of per-cpu workqueues being served by per-cpu pools and
> unbound by per-NUMA pools.
>
> In terms of max_active enforcement, this was, while not perfect, workable.
> For per-cpu workqueues, it was fine. For unbound, it wasn't great in that
> NUMA machines would get max_active that's multiplied by the number of nodes
> but didn't cause huge problems because NUMA machines are relatively rare and
> the node count is usually pretty low.
>

Hello, Tejun

The patchset seems complicated to me. For me, reverting a bit to the behavior
of 636b927eba5b ("workqueue: Make unbound workqueues to use per-cpu
pool_workqueues"), like the following code (untested, just for showing
the idea),
seems simpler.

max_active will have the same behavior as before if the wq is configured
with WQ_AFFN_NUMA. For WQ_AFFN_{CPU|SMT|CACHE}, the problem
isn't fixed.

Thanks
Lai

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2989b57e154a..d0f133f1441b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4364,8 +4364,16 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
ctx->dfl_pwq->refcnt++;
ctx->pwq_tbl[cpu] = ctx->dfl_pwq;
} else {
+ int head_cpu;
+
wq_calc_pod_cpumask(new_attrs, cpu, -1);
- ctx->pwq_tbl[cpu] = alloc_unbound_pwq(wq, new_attrs);
+ head_cpu = cpumask_first(new_attrs->__pod_cpumask);
+ if (!ctx->pwq_tbl[head_cpu]) {
+ ctx->pwq_tbl[cpu] =
alloc_unbound_pwq(wq, new_attrs);
+ } else {
+ ctx->pwq_tbl[head_cpu]->refcnt++;
+ ctx->pwq_tbl[cpu] = ctx->pwq_tbl[head_cpu];
+ }
if (!ctx->pwq_tbl[cpu])
goto out_free;
}