Re: [PATCH RFC 0/5] workqueue: add WQ_AFFN_CACHE_SHARD affinity scope

From: Tejun Heo

Date: Fri Mar 13 2026 - 14:00:40 EST


Hello,

Applied 1/5. Some comments on the rest:

- The sharding currently splits on CPU boundary, which can split SMT
siblings across different pods. The worse performance on Intel compared
to the SMT scope may indicate exactly this - HT siblings ending up in
different pods. It'd be better to shard on core boundary so that SMT
siblings always stay together.

- How was the default shard size of 8 picked? There's a tradeoff between
the number of kworkers created and locality. Can you also report the
number of kworkers for each configuration? And is there data on
different shard sizes? It'd be useful to see how the numbers change
across e.g. 4, 8, 16, 32.

- Can you also test on AMD machines? Their CCD topology (16 or 32
threads per LLC) would be a good data point.

Thanks.

--
tejun