Re: [PATCH RFC 0/5] workqueue: add WQ_AFFN_CACHE_SHARD affinity scope
From: Tejun Heo
Date: Fri Mar 13 2026 - 14:00:40 EST
Hello,
Applied 1/5. Some comments on the rest:
- The sharding currently splits on CPU boundary, which can split SMT
  siblings across different pods. The worse performance on Intel compared
  to SMT scope may indicate exactly this - HT siblings ending up in
  different pods. It'd be better to shard on core boundary so that SMT
  siblings always stay together.
- How was the default shard size of 8 picked? There's a tradeoff between
the number of kworkers created and locality. Can you also report the
number of kworkers for each configuration? And is there data on
different shard sizes? It'd be useful to see how the numbers change
across e.g. 4, 8, 16, 32.
- Can you also test on AMD machines? Their CCD topology (16 or 32
threads per LLC) would be a good data point.
Thanks.
--
tejun