On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
It was discovered that isolated CPUs could sometimes be disturbed by
kworkers processing kfree_rcu() works causing higher than expected
latency. It is because the RCU core uses "system_wq" which doesn't have
the WQ_UNBOUND flag to handle all its work items. Fix this violation of
latency limits by using "system_unbound_wq" in the RCU core instead.
This will ensure that those work items will not be run on CPUs marked
as isolated.

Beside the WQ_UNBOUND flag, the other major difference between system_wq
and system_unbound_wq is their max_active count. The system_unbound_wq
has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
is WQ_DFL_ACTIVE (256) which is half of WQ_MAX_ACTIVE.

Reported-by: Vratislav Bendel <vbendel@xxxxxxxxxx>

I've seen this problem a while ago and reported to the list:

https://lore.kernel.org/all/Zp906X7VJGNKl5fW@xxxxxxxxx/

I've just applied this patch and ran my workload for 2 hours without
hitting this issue. Thanks for solving it.

Tested-by: Breno Leitao <leitao@xxxxxxxxxx>