Re: [PATCH v2 2/5] workqueue: add WQ_AFFN_CACHE_SHARD affinity scope

From: Tejun Heo

Date: Thu Mar 26 2026 - 15:47:17 EST

On Thu, Mar 26, 2026 at 09:20:15AM -0700, Breno Leitao wrote:
> Assuming wq_cache_shard_size = 8, we would have the following number of pools
> per number of CPUs (not vCPUs):
>
> - 1–11 CPUs → DIV_ROUND_CLOSEST(n, 8) ≤ 1 → 1 pool containing all CPUs.
> - 12 CPUs → DIV_ROUND_CLOSEST(12, 8) = 2 → 2 pools of 6 cores each. This is the first split.
> - 12–19 → 2 pools
> - 20–27 → 3 pools
> - 28–35 → 4 pools
> - 36–43 → 5 pools
> - 44–51 → 6 pools
> - 52–59 → 7 pools
> - 60–67 → 8 pools
> - 68–75 → 9 pools (e.g. 72-CPU NVIDIA Grace → 9×8)
> - 76–83 → 10 pools
> - 84–91 → 11 pools
> - 92–99 → 12 pools
> - 100 → 13 pools (9×8 + 4×7)
>
> Is this what you meant?

Yes.
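
For reference, the table quoted above can be reproduced with a small
userspace sketch. The helper name nr_shards_for() and the macro
DIV_ROUND_CLOSEST_POS() are made up for illustration and are not taken from
the patch; the clamp to a minimum of one pool is an assumption drawn from the
"1–11 CPUs → 1 pool" row.

```c
#include <assert.h>

/* Same rounding as the kernel's DIV_ROUND_CLOSEST() for positive operands. */
#define DIV_ROUND_CLOSEST_POS(x, d)	(((x) + (d) / 2) / (d))

/* Hypothetical helper: number of shards (pools) for nr_cores cores. */
int nr_shards_for(int nr_cores, int shard_size)
{
	int nr;

	assert(nr_cores > 0 && shard_size > 0);

	nr = DIV_ROUND_CLOSEST_POS(nr_cores, shard_size);

	/* Assumed clamp: small core counts round down to 0, use one pool. */
	return nr > 0 ? nr : 1;
}
```

With a shard size of 8, this gives 1 pool up to 11 cores, 2 pools from 12
cores, 9 pools for 72 cores, and 13 pools for 100 cores, matching the list.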

> +static int __init llc_core_to_shard(int core_pos, int cores_per_shard,
> +				    int remainder)
> +{
> +	int ret;
> +
> +	/*
> +	 * These cores fall within the large shards.
> +	 * Each large shard has (cores_per_shard + 1) cores.
> +	 */
> +	if (core_pos < remainder * (cores_per_shard + 1))
> +		return core_pos / (cores_per_shard + 1);
> +
> +	/* These are standard shards */
> +	ret = (core_pos - remainder * (cores_per_shard + 1)) / cores_per_shard;

This is too smart. Any chance you can dumb it down? If you have to go
through intermediate data structures, that's fine too.
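
One possible shape for a simpler version, only as a rough sketch: walk the
shards in order and record each core's shard in an intermediate array, so no
closed-form index arithmetic is needed. The function name
fill_core_to_shard() and the shard_of[] array are hypothetical and not part of
the patch; the split (the first "remainder" shards get one extra core) is
assumed to match the quoted code.

```c
#include <assert.h>

/*
 * Hypothetical helper, not from the patch: fill shard_of[core_pos] for
 * nr_cores cores split into nr_shards shards. The first
 * (nr_cores % nr_shards) shards are assumed to get one extra core.
 */
void fill_core_to_shard(int *shard_of, int nr_cores, int nr_shards)
{
	int cores_per_shard, remainder, shard, core = 0;

	assert(nr_cores > 0 && nr_shards > 0 && nr_shards <= nr_cores);

	cores_per_shard = nr_cores / nr_shards;
	remainder = nr_cores % nr_shards;

	for (shard = 0; shard < nr_shards; shard++) {
		/* Large shards come first and hold one extra core. */
		int size = cores_per_shard + (shard < remainder ? 1 : 0);
		int i;

		for (i = 0; i < size; i++)
			shard_of[core++] = shard;
	}
}
```

For 100 cores and 13 shards this assigns cores 0-71 to nine shards of 8 and
cores 72-99 to four shards of 7, and a later lookup is just shard_of[core_pos].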

Thanks.

--
tejun