Re: [PATCH 3/8] workqueue: Implement BH workqueues to eventually replace tasklets

From: Tejun Heo
Date: Thu Feb 01 2024 - 16:48:29 EST


Hello, Lai.

On Thu, Feb 01, 2024 at 07:02:27PM +0800, Lai Jiangshan wrote:
> On Tue, Jan 30, 2024 at 5:16 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
> > @@ -1184,6 +1211,14 @@ static bool kick_pool(struct worker_pool *pool)
> > if (!need_more_worker(pool) || !worker)
> > return false;
> >
> > + if (pool->flags & POOL_BH) {
> > + if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
> > + tasklet_hi_schedule(&worker->bh_tasklet);
> > + else
> > + tasklet_schedule(&worker->bh_tasklet);
> > + return true;
> > + }
>
> I think it is more straightforward to call bh_worker_taskletfn[_hi]()
> in tasklet_action() and tasklet_hi_action() rather than add a
> worker->bh_tasklet.
>
> raise_softirq_irqoff() can be used here (kick_pool()) instead.
>
> As the changelog said, once the conversion is complete, tasklet can be
> removed and BH workqueues can directly take over the tasklet softirqs,
> in which case, then, bh_worker_taskletfn[_hi]() can directly take over
> the tasklet_action() and tasklet_hi_action().

Hmmm.... maybe. Yeah, that'd also make it a tiny bit cheaper for hot paths.
Lemme see how that looks.

> I think wq->max_active can be forced to be UINT_MAX or ULONG_MAX
> in the max_active management code to avoid a branch here.

Good point. Will update.

> worker_attach_to_pool() and worker_detach_from_pool also access to
> worker->task with kthread_set_per_cpu() and luckily to_kthread()
> checks the NULL pointer for it.
>
> IMO, it is better to avoid calling kthread_set_per_cpu() for bh workers.

Note that BH worker pools are always DISASSOCIATED, so
worker_attach_to_pool() shouldn't call kthread_set_per_cpu(). Also, BH
workers and worker pools are never destroyed, so worker_detach_from_pool()
shouldn't be called either. I'll add WARN_ONs to clarify.

> > @@ -5605,7 +5731,12 @@ static void pr_cont_pool_info(struct worker_pool *pool)
> > pr_cont(" cpus=%*pbl", nr_cpumask_bits, pool->attrs->cpumask);
> > if (pool->node != NUMA_NO_NODE)
> > pr_cont(" node=%d", pool->node);
> > - pr_cont(" flags=0x%x nice=%d", pool->flags, pool->attrs->nice);
> > + pr_cont(" flags=0x%x", pool->flags);
> > + if (pool->flags & POOL_BH)
> > + pr_cont(" bh%s",
> > + pool->attrs->nice == HIGHPRI_NICE_LEVEL ? "-hi" : "");
> > + else
> > + pr_cont(" nice=%d", pool->attrs->nice);
>
> There are also some "worker->task" in show_pwq(), show_one_worker_pool(),
> and show_cpu_pool_hog() needing taking care of.

Ah, right, I'll hunt them down.

Thanks.

--
tejun