Re: [iwl-next PATCH v4 2/3] idpf: convert workqueues to unbound

From: Brian Vazquez
Date: Mon Dec 16 2024 - 15:16:38 EST


On Mon, Dec 16, 2024 at 1:11 PM Alexander Lobakin
<aleksander.lobakin@xxxxxxxxx> wrote:
>
> From: Brian Vazquez <brianvv@xxxxxxxxxx>
> Date: Mon, 16 Dec 2024 16:27:34 +0000
>
> > From: Marco Leogrande <leogrande@xxxxxxxxxx>
> >
> > When a workqueue is created with `WQ_UNBOUND`, its work items are
> > served by special worker-pools, whose host workers are not bound to
> > any specific CPU. In the default configuration (i.e. when
> > `queue_delayed_work` and friends do not specify which CPU to run the
> > work item on), `WQ_UNBOUND` allows the work item to be executed on any
> > CPU in the same node of the CPU it was enqueued on. While this
> > solution potentially sacrifices locality, it avoids contention with
> > other processes that might dominate the CPU time of the processor the
> > work item was scheduled on.
> >
> > This is not just a theoretical problem: in a particular scenario a
> > misconfigured process was hogging most of the time from CPU0, leaving
> > less than 0.5% of its CPU time to the kworker. The IDPF workqueues
> > that were using the kworker on CPU0 suffered large completion delays
> > as a result, causing performance degradation, timeouts and an
> > eventual system crash.
>
> Wasn't this inspired by [0]?
>
> [0]
> https://lore.kernel.org/netdev/20241126035849.6441-11-milena.olech@xxxxxxxxx

The root cause is exactly the same, so I do see the similarity, and I'm
not surprised that both were addressed with a similar patch. We hit
this problem some time ago; the first attempt to fix it was posted in
August [0].

[0]
https://lore.kernel.org/netdev/20240813182747.1770032-4-manojvishy@xxxxxxxxxx/

>
> Thanks,
> Olek