RE: [Intel-wired-lan] [iwl-next PATCH v4 2/3] idpf: convert workqueues to unbound

From: Singh, Krishneil K
Date: Tue Jan 14 2025 - 02:01:06 EST

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@xxxxxxxxxx> On Behalf Of
> Brian Vazquez
> Sent: Monday, December 16, 2024 12:13 PM
> To: Lobakin, Aleksander <aleksander.lobakin@xxxxxxxxx>
> Cc: Brian Vazquez <brianvv.kernel@xxxxxxxxx>; Nguyen, Anthony L
> <anthony.l.nguyen@xxxxxxxxx>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@xxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>;
> Eric Dumazet <edumazet@xxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>;
> Paolo Abeni <pabeni@xxxxxxxxxx>; intel-wired-lan@xxxxxxxxxxxxxxxx; David
> Decotigny <decot@xxxxxxxxxx>; Vivek Kumar <vivekmr@xxxxxxxxxx>;
> Singhai, Anjali <anjali.singhai@xxxxxxxxx>; Samudrala, Sridhar
> <sridhar.samudrala@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; Tantilov, Emil S <emil.s.tantilov@xxxxxxxxx>; Marco
> Leogrande <leogrande@xxxxxxxxxx>; Manoj Vishwanathan
> <manojvishy@xxxxxxxxxx>; Keller, Jacob E <jacob.e.keller@xxxxxxxxx>; Linga,
> Pavan Kumar <pavan.kumar.linga@xxxxxxxxx>
> Subject: Re: [Intel-wired-lan] [iwl-next PATCH v4 2/3] idpf: convert
> workqueues to unbound
>
> On Mon, Dec 16, 2024 at 1:11 PM Alexander Lobakin
> <aleksander.lobakin@xxxxxxxxx> wrote:
> >
> > From: Brian Vazquez <brianvv@xxxxxxxxxx>
> > Date: Mon, 16 Dec 2024 16:27:34 +0000
> >
> > > From: Marco Leogrande <leogrande@xxxxxxxxxx>
> > >
> > > When a workqueue is created with `WQ_UNBOUND`, its work items are
> > > served by special worker-pools, whose host workers are not bound to
> > > any specific CPU. In the default configuration (i.e. when
> > > `queue_delayed_work` and friends do not specify which CPU to run the
> > > > work item on), `WQ_UNBOUND` allows the work item to be executed on any
> > > > CPU in the same node of the CPU it was enqueued on. While this
> > > solution potentially sacrifices locality, it avoids contention with
> > > other processes that might dominate the CPU time of the processor the
> > > work item was scheduled on.
> > >
> > > > This is not just a theoretical problem: in a particular scenario a
> > > > misconfigured process was hogging most of the time from CPU0, leaving
> > > less than 0.5% of its CPU time to the kworker. The IDPF workqueues
> > > that were using the kworker on CPU0 suffered large completion delays
> > > > as a result, causing performance degradation, timeouts and an eventual
> > > > system crash.
> >
> > Wasn't this inspired by [0]?
> >
> > [0]
> > https://lore.kernel.org/netdev/20241126035849.6441-11-
> milena.olech@xxxxxxxxx
>
> The root cause is exactly the same, so I do see the similarity and I'm
> not surprised that both were addressed with a similar patch. We hit
> this problem some time ago, and the first attempt to land this change
> was in August [0].
>
> [0] https://lore.kernel.org/netdev/20240813182747.1770032-4-manojvishy@xxxxxxxxxx/
>
> >
> > Thanks,
> > Olek
Tested-by: Krishneil Singh <krishneil.k.singh@xxxxxxxxx>