Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs

From: Waiman Long
Date: Thu Jul 25 2024 - 13:02:20 EST


On 7/25/24 11:35, Neeraj Upadhyay wrote:
> On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
>> It was discovered that isolated CPUs could sometimes be disturbed by
>> kworkers processing kfree_rcu() works causing higher than expected
>> latency. It is because the RCU core uses "system_wq" which doesn't have
>> the WQ_UNBOUND flag to handle all its work items. Fix this violation of
>> latency limits by using "system_unbound_wq" in the RCU core instead.
>> This will ensure that those work items will not be run on CPUs marked
>> as isolated.
>
> Alternative approach here could be, in case we want to keep per CPU worker
> pools, define a wq with WQ_CPU_INTENSIVE flag. Are there cases where
> WQ_CPU_INTENSIVE wq won't be sufficient for the problem this patch
> is fixing?

What exactly will we gain by defining a WQ_CPU_INTENSIVE workqueue? Or what will we lose by using system_unbound_wq? All the calls that are modified to use system_unbound_wq are using WORK_CPU_UNBOUND as their cpu. IOW, they don't care which CPUs are used to run the work items. The only downside I can see is the possible loss of some cache locality.

In fact, WQ_CPU_INTENSIVE can be considered a subset of WQ_UNBOUND. A WQ_UNBOUND workqueue will avoid using isolated CPUs, but a WQ_CPU_INTENSIVE workqueue will not.
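
To make the difference concrete, here is a small userspace model (not kernel code) of where a work item queued with WORK_CPU_UNBOUND ends up running. The MODEL_* names and model_pick_cpu() are hypothetical and exist only for this illustration. The placement rule is a deliberate simplification of what the workqueue code does: a per-CPU workqueue runs the item on the submitting CPU, while an unbound one only uses non-isolated CPUs.

```c
#include <stdbool.h>

/*
 * Hypothetical flags for this toy model. They only mirror the names of
 * the kernel's WQ_UNBOUND and WQ_CPU_INTENSIVE flags.
 */
enum model_wq_flags {
	MODEL_WQ_UNBOUND       = 1 << 0,
	MODEL_WQ_CPU_INTENSIVE = 1 << 1,
};

#define MODEL_NR_CPUS 4

/*
 * Return the CPU on which the model runs a work item that was queued
 * from submitting_cpu, or -1 if no CPU is eligible.
 */
static int model_pick_cpu(unsigned int flags, int submitting_cpu,
			  const bool isolated[MODEL_NR_CPUS])
{
	/*
	 * Per-CPU workqueue (with or without CPU_INTENSIVE): the item
	 * stays on the submitting CPU, isolated or not. In this model
	 * CPU_INTENSIVE does not change placement at all.
	 */
	if (!(flags & MODEL_WQ_UNBOUND))
		return submitting_cpu;

	/* Unbound: keep the local CPU if it is not isolated. */
	if (!isolated[submitting_cpu])
		return submitting_cpu;

	/* Otherwise fall back to the first non-isolated CPU. */
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (!isolated[cpu])
			return cpu;

	return -1;
}
```

With CPUs 2 and 3 marked isolated, an item queued from CPU 3 stays on CPU 3 for both a plain per-CPU workqueue and a MODEL_WQ_CPU_INTENSIVE one, and moves to CPU 0 only with MODEL_WQ_UNBOUND.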

Cheers,
Longman