Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
From: Neeraj Upadhyay
Date: Thu Jul 25 2024 - 11:36:09 EST
On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
> It was discovered that isolated CPUs could sometimes be disturbed by
> kworkers processing kfree_rcu() work items, causing higher than expected
> latency. This is because the RCU core queues all of its work items on
> "system_wq", which does not have the WQ_UNBOUND flag. Fix this violation
> of latency limits by using "system_unbound_wq" in the RCU core instead.
> This ensures that those work items will not run on CPUs marked as
> isolated.
>
An alternative approach, in case we want to keep the per-CPU worker
pools, would be to define a dedicated wq with the WQ_CPU_INTENSIVE flag.
Are there cases where a WQ_CPU_INTENSIVE wq would not be sufficient for
the problem this patch is fixing?
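
To make the suggestion concrete, a rough and untested sketch of what I
have in mind is below. The rcu_kfree_wq variable, the "rcu_kfree" name
and the rcu_kfree_wq_init() helper are made up for illustration; where
exactly the allocation would have to happen during boot is also an
assumption that needs checking.

```c
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/workqueue.h>

/* Hypothetical dedicated workqueue for the kfree_rcu() work items. */
static struct workqueue_struct *rcu_kfree_wq;

/* Hypothetical init helper; the right call site is not settled here. */
static int __init rcu_kfree_wq_init(void)
{
	/*
	 * No WQ_UNBOUND, so the per-CPU worker pools are kept.
	 * A max_active of 0 selects the default (WQ_DFL_ACTIVE).
	 */
	rcu_kfree_wq = alloc_workqueue("rcu_kfree", WQ_CPU_INTENSIVE, 0);
	return rcu_kfree_wq ? 0 : -ENOMEM;
}

/*
 * The call sites touched by this patch would then pass rcu_kfree_wq
 * where they currently pass system_wq, for example:
 *
 *	queue_delayed_work(rcu_kfree_wq, &krcp->monitor_work, delay);
 */
```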
- Neeraj
> Besides the WQ_UNBOUND flag, the other major difference between system_wq
> and system_unbound_wq is their max_active count. The system_unbound_wq
> has a max_active of WQ_MAX_ACTIVE (512), while system_wq's max_active
> is WQ_DFL_ACTIVE (256), which is half of WQ_MAX_ACTIVE.
>
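For anyone checking the numbers above, here is a small userspace model
of how I understand max_active to be resolved at allocation time. The
two constants mirror include/linux/workqueue.h as I read it; the
model_max_active() helper is hypothetical and only imitates the
"0 means default, then clamp" behaviour, it is not kernel code.

```c
/* Userspace model only; constants mirror include/linux/workqueue.h. */
enum {
	WQ_MAX_ACTIVE = 512,
	WQ_DFL_ACTIVE = WQ_MAX_ACTIVE / 2,
};

/*
 * Hypothetical helper: a requested max_active of 0 falls back to
 * WQ_DFL_ACTIVE, and the result is clamped to [1, WQ_MAX_ACTIVE].
 */
static int model_max_active(int requested)
{
	int max_active = requested ? requested : WQ_DFL_ACTIVE;

	if (max_active < 1)
		max_active = 1;
	if (max_active > WQ_MAX_ACTIVE)
		max_active = WQ_MAX_ACTIVE;
	return max_active;
}
```

With this model, a request of 0 (as I believe system_wq uses) resolves
to 256, and a request of WQ_MAX_ACTIVE (system_unbound_wq) stays at 512.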
> Reported-by: Vratislav Bendel <vbendel@xxxxxxxxxx>
> Closes: https://issues.redhat.com/browse/RHEL-50220
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> ---
> kernel/rcu/tasks.h | 4 ++--
> kernel/rcu/tree.c | 8 ++++----
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index e641cc681901..494aa9513d0b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3539,10 +3539,10 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
>  	if (delayed_work_pending(&krcp->monitor_work)) {
>  		delay_left = krcp->monitor_work.timer.expires - jiffies;
>  		if (delay < delay_left)
> -			mod_delayed_work(system_wq, &krcp->monitor_work, delay);
> +			mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
>  		return;
>  	}
> -	queue_delayed_work(system_wq, &krcp->monitor_work, delay);
> +	queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
>  }
>
> static void
> @@ -3634,7 +3634,7 @@ static void kfree_rcu_monitor(struct work_struct *work)
>  			// be that the work is in the pending state when
>  			// channels have been detached following by each
>  			// other.
> -			queue_rcu_work(system_wq, &krwp->rcu_work);
> +			queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
>  		}
>  	}
>
> @@ -3704,7 +3704,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
>  	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
>  			!atomic_xchg(&krcp->work_in_progress, 1)) {
>  		if (atomic_read(&krcp->backoff_page_cache_fill)) {
> -			queue_delayed_work(system_wq,
> +			queue_delayed_work(system_unbound_wq,
>  				&krcp->page_cache_work,
>  					msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
>  		} else {
> --
> 2.43.5
>