Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs

From: Neeraj Upadhyay
Date: Thu Jul 25 2024 - 15:33:18 EST


On Thu, Jul 25, 2024 at 01:02:01PM -0400, Waiman Long wrote:
> On 7/25/24 11:35, Neeraj Upadhyay wrote:
> > On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
> > > It was discovered that isolated CPUs could sometimes be disturbed by
> > > kworkers processing kfree_rcu() works causing higher than expected
> > > latency. It is because the RCU core uses "system_wq" which doesn't have
> > > the WQ_UNBOUND flag to handle all its work items. Fix this violation of
> > > latency limits by using "system_unbound_wq" in the RCU core instead.
> > > This will ensure that those work items will not be run on CPUs marked
> > > as isolated.
> > >
> > Alternative approach here could be, in case we want to keep per CPU worker
> > pools, define a wq with WQ_CPU_INTENSIVE flag. Are there cases where
> > WQ_CPU_INTENSIVE wq won't be sufficient for the problem this patch
> > is fixing?
>
> What exactly will we gain by defining a WQ_CPU_INTENSIVE workqueue? Or what
> will we lose by using system_unbound_wq? All the calls that are modified to
> use system_unbound_wq are using WORK_CPU_UNBOUND as their cpu. IOW, they
> don't care which CPUs are used to run the work items. The only downside I
> can see is the possible loss of some cache locality.
>

For the nohz_full case, where unbound pool workers run only on the
housekeeping CPU (cpu0), execution of those work items could get delayed
if multiple other CPUs are queuing work. However, this should not
generally happen, as the other CPUs would mostly be running in user mode.
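
To make the placement difference concrete, here is a minimal userspace
sketch. It is not kernel code: struct model_wq, model_target_cpu(),
model_load_on_cpu() and the 4-CPU layout (cpu0 housekeeping, cpus 1-3
isolated) are all hypothetical names and assumptions for illustration.
It only models the idea that a WORK_CPU_UNBOUND request on a per-CPU
workqueue stays on the queuing CPU, while on an unbound workqueue it is
served by a non-isolated CPU. The real unbound worker placement follows
the unbound cpumask and is more involved than "first housekeeping CPU".

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for a workqueue; models only the WQ_UNBOUND bit. */
struct model_wq {
	bool unbound;
};

/* Assumed layout: cpu0 is housekeeping, cpus 1-3 are isolated (nohz_full). */
static const bool model_cpu_isolated[MODEL_NR_CPUS] = {
	false, true, true, true
};

/*
 * Model of where a work item queued with WORK_CPU_UNBOUND ends up:
 * a per-CPU workqueue runs it on the queuing CPU, an unbound one
 * only on a non-isolated CPU (simplified here to the first such CPU).
 */
static int model_target_cpu(const struct model_wq *wq, int queuing_cpu)
{
	assert(queuing_cpu >= 0 && queuing_cpu < MODEL_NR_CPUS);

	if (!wq->unbound)
		return queuing_cpu;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (!model_cpu_isolated[cpu])
			return cpu;

	/* No housekeeping CPU in this model; fall back to the queuing CPU. */
	return queuing_cpu;
}

/* Count the items landing on @cpu when every CPU queues one item. */
static int model_load_on_cpu(const struct model_wq *wq, int cpu)
{
	int n = 0;

	for (int src = 0; src < MODEL_NR_CPUS; src++)
		if (model_target_cpu(wq, src) == cpu)
			n++;

	return n;
}
```

In this model, an item queued from cpu2 stays on cpu2 with the per-CPU
workqueue and moves to cpu0 with the unbound one. If all four CPUs queue
one item each, cpu0 ends up with all four in the unbound case, which is
the possible delay mentioned above.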


> In fact, WQ_CPU_INTENSIVE can be considered a subset of WQ_UNBOUND. A
> WQ_UNBOUND workqueue will avoid using isolated CPUs, but not a
> WQ_CPU_INTENSIVE workqueue.

Got it, thanks!

I have picked the patch for further review and testing [1].


[1] https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/log/?h=next


- Neeraj

>
> Cheers,
> Longman
>
>