Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs

From: Waiman Long
Date: Sun Jul 28 2024 - 23:07:06 EST


On 7/24/24 09:30, Breno Leitao wrote:
On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
It was discovered that isolated CPUs could sometimes be disturbed by
kworkers processing kfree_rcu() work items, causing higher than expected
latency. This is because the RCU core handles all its work items with
"system_wq", which doesn't have the WQ_UNBOUND flag. Fix this violation
of latency limits by using "system_unbound_wq" in the RCU core instead.
This will ensure that those work items will not be run on CPUs marked
as isolated.

Besides the WQ_UNBOUND flag, the other major difference between system_wq
and system_unbound_wq is their max_active count. The system_unbound_wq
has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
is WQ_DFL_ACTIVE (256) which is half of WQ_MAX_ACTIVE.
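
For illustration only, the two differences described above can be written
down as a small userspace model. This is a sketch, not kernel code: the
struct, the "_model" names and the helper are hypothetical, the numbers are
the ones quoted above rather than values read from kernel headers, and the
rule about isolated CPUs is a simplification of the description above.

```c
#include <stdbool.h>

/* Values as quoted in the message above; not taken from kernel headers. */
enum {
	WQ_MAX_ACTIVE = 512,
	WQ_DFL_ACTIVE = WQ_MAX_ACTIVE / 2,	/* 256 */
};

/* Hypothetical userspace model of a workqueue; not the kernel's struct. */
struct wq_model {
	const char *name;
	bool unbound;		/* models the WQ_UNBOUND flag */
	int max_active;
};

static const struct wq_model system_wq_model = {
	"system_wq", false, WQ_DFL_ACTIVE
};

static const struct wq_model system_unbound_wq_model = {
	"system_unbound_wq", true, WQ_MAX_ACTIVE
};

/*
 * Simplified rule from the description above: only work queued on a
 * workqueue without WQ_UNBOUND may end up running on an isolated CPU.
 */
static bool may_disturb_isolated_cpu(const struct wq_model *wq)
{
	return !wq->unbound;
}
```

In this model, switching from system_wq_model to system_unbound_wq_model
flips the result of may_disturb_isolated_cpu() and doubles max_active,
which are the two effects the description attributes to the change.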

Reported-by: Vratislav Bendel <vbendel@xxxxxxxxxx>
I saw this problem a while ago and reported it to the list:

https://lore.kernel.org/all/Zp906X7VJGNKl5fW@xxxxxxxxx/

I've just applied this patch, and run my workload for 2 hours without
hitting this issue. Thanks for solving it.

Tested-by: Breno Leitao <leitao@xxxxxxxxxx>

Thanks for testing this patch. So it is not just us that saw this problem.

Cheers,
Longman