Re: [PATCH 6.1] arch_topology: Build cacheinfo from primary CPU

From: Wen Yang

Date: Tue Sep 30 2025 - 11:45:37 EST




On 9/30/25 02:29, Greg Kroah-Hartman wrote:
On Tue, Sep 30, 2025 at 01:57:40AM +0800, Wen Yang wrote:


On 9/29/25 21:21, Greg Kroah-Hartman wrote:
On Sat, Sep 27, 2025 at 01:46:58AM +0800, Wen Yang wrote:
From: Pierre Gondois <pierre.gondois@xxxxxxx>

commit 5944ce092b97caed5d86d961e963b883b5c44ee2 upstream.


commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection in the
CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
'BUG: sleeping function called from invalid context' [1]
as the code is executed with preemption and interrupts disabled.

The primary CPU was previously storing the cache information using
the now removed (struct cpu_topology).llc_id:
commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from
the CPU topology")

allocate_cache_info() tries to build the cacheinfo from the primary
CPU before the secondary CPUs boot, if the DT/ACPI description
contains cache information.
If allocate_cache_info() fails, then fall back to the current state
for the cacheinfo allocation. [1] will be triggered in such a case.
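
A minimal userspace sketch of the allocation order described above, not the kernel code itself. All names ending in _sketch are hypothetical, and the "atomic" flag merely stands in for running with preemption and interrupts disabled. The counter records how many allocations would have happened in that context, that is, how many times [1] could fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4

struct cache_leaf_sketch {
	unsigned int level;
};

struct cpu_cacheinfo_sketch {
	struct cache_leaf_sketch *leaves;	/* NULL until allocated */
	unsigned int num_leaves;
};

static struct cpu_cacheinfo_sketch cacheinfo_sketch[NR_CPUS_SKETCH];

/* Allocations attempted in "atomic" context; each one models a hit of [1]. */
static unsigned int atomic_allocs_sketch;

static int alloc_leaves_sketch(unsigned int cpu, unsigned int num_leaves,
			       bool atomic)
{
	assert(cpu < NR_CPUS_SKETCH);
	if (atomic)
		atomic_allocs_sketch++;

	cacheinfo_sketch[cpu].leaves =
		calloc(num_leaves, sizeof(struct cache_leaf_sketch));
	if (!cacheinfo_sketch[cpu].leaves)
		return -1;
	cacheinfo_sketch[cpu].num_leaves = num_leaves;
	return 0;
}

/*
 * Runs on the primary CPU before the secondaries boot, where sleeping
 * allocations are allowed. fw_num_leaves == 0 models a DT/ACPI
 * description without cache information.
 */
static int early_fetch_sketch(unsigned int cpu, unsigned int fw_num_leaves)
{
	if (fw_num_leaves == 0)
		return -1;	/* nothing to build from: caller falls back */
	return alloc_leaves_sketch(cpu, fw_num_leaves, false);
}

/* Runs on the secondary CPU itself during bring-up ("atomic" context). */
static int secondary_detect_sketch(unsigned int cpu, unsigned int hw_num_leaves)
{
	assert(cpu < NR_CPUS_SKETCH);
	if (cacheinfo_sketch[cpu].leaves)
		return 0;	/* already allocated early, nothing to do */
	return alloc_leaves_sketch(cpu, hw_num_leaves, true);	/* fallback */
}
```

If early_fetch_sketch() succeeds for a CPU, the later secondary_detect_sketch() call for it does not allocate. If the early step fails, the fallback path allocates in the "atomic" context and the counter is incremented.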

When unplugging a CPU, the cacheinfo memory cannot be freed. If it
was, then the memory would be allocated early by the re-plugged
CPU and would trigger [1].

Note that populate_cache_leaves() might be called multiple times
due to populate_leaves being moved up. This is required since
detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
being allocated but not populated.
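
The two paragraphs above (memory kept across unplug, populate possibly running more than once) can be illustrated with a small userspace sketch. This is an assumption-laden model, not the upstream implementation: every name ending in _sketch is hypothetical, and only the control flow (skip allocation when memory already exists, always populate) follows the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct cacheinfo_state_sketch {
	unsigned int *levels;		/* allocated once, never freed */
	unsigned int num_leaves;
	bool populated;
	unsigned int populate_calls;	/* how often populate ran */
};

/* Safe to call repeatedly: it only rewrites already allocated entries. */
static int populate_sketch(struct cacheinfo_state_sketch *ci)
{
	for (unsigned int i = 0; i < ci->num_leaves; i++)
		ci->levels[i] = i + 1;
	ci->populated = true;
	ci->populate_calls++;
	return 0;
}

static int detect_sketch(struct cacheinfo_state_sketch *ci,
			 unsigned int num_leaves)
{
	/* Memory may exist (early allocation or earlier online) yet be unpopulated. */
	if (ci->levels)
		goto populate;

	ci->levels = calloc(num_leaves, sizeof(*ci->levels));
	if (!ci->levels)
		return -1;
	ci->num_leaves = num_leaves;

populate:
	return populate_sketch(ci);
}

/* Unplug marks the data stale but deliberately keeps the memory. */
static void unplug_sketch(struct cacheinfo_state_sketch *ci)
{
	ci->populated = false;
}
```

Because unplug_sketch() keeps the buffer, a re-plugged CPU goes straight to the populate step and never allocates in its early bring-up path.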

[1]:
| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
| preempt_count: 1, expected: 0
| RCU nest depth: 1, expected: 1
| 3 locks held by swapper/111/0:
| #0: (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
| #1: (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
| #2: (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
| irq event stamp: 0
| hardirqs last enabled at (0): 0x0
| hardirqs last disabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last enabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last disabled at (0): 0x0
| Preemption disabled at:
| migrate_enable+0x30/0x130
| CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
| Call trace:
| __kmalloc+0xbc/0x1e8
| detect_cache_attributes+0x2d4/0x5f0
| update_siblings_masks+0x30/0x368
| store_cpu_topology+0x78/0xb8
| secondary_start_kernel+0xd0/0x198
| __secondary_switched+0xb0/0xb4

Signed-off-by: Pierre Gondois <pierre.gondois@xxxxxxx>
Reviewed-by: Sudeep Holla <sudeep.holla@xxxxxxx>
Acked-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@xxxxxxx
Signed-off-by: Sudeep Holla <sudeep.holla@xxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> # 6.1.x: c3719bd:cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
Cc: <stable@xxxxxxxxxxxxxxx> # 6.1.x: 8844c3d:cacheinfo: Return error code in init_of_cache_level()
Cc: <stable@xxxxxxxxxxxxxxx> # 6.1.x: de0df44:cacheinfo: Check 'cache-unified' property to count cache leaves
Cc: <stable@xxxxxxxxxxxxxxx> # 6.1.x: fa4d566:ACPI: PPTT: Remove acpi_find_cache_levels()
Cc: <stable@xxxxxxxxxxxxxxx> # 6.1.x: bd50036:ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()
Cc: <stable@xxxxxxxxxxxxxxx> # 6.1.x

I do not understand, why do you want all of these applied as well? Can
you just send the full series of commits?

Thanks for your comments, here is the original series:
https://lore.kernel.org/all/167404285593.885445.6219705651301997538.b4-ty@xxxxxxx/

commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection in the
CPU hotplug path") introduced a bug, and this series fixed it.

Signed-off-by: Wen Yang <wen.yang@xxxxxxxxx>

Also, you have changed this commit a lot from the original one, please
document what you did here.

Thanks for the reminder. We only intend to cherry-pick them onto the 6.1
stable branch, without modifying the original commits.
We also checked again, as follows:

$ git cherry-pick c3719bd
$ git cherry-pick 8844c3d
$ git cherry-pick de0df44
$ git cherry-pick fa4d566
$ git cherry-pick bd50036
$ git cherry-pick 5944ce0

$ git format-patch HEAD -1

$ diff 0001-arch_topology-Build-cacheinfo-from-primary-CPU.patch \
       20250927_wen_yang_arch_topology_build_cacheinfo_from_primary_cpu.mbx


Can you resend these all as a patch series with your signed-off-by on
them to show that you have tested them?

And again, the commit here did not seem to match up with the original
upstream version, but maybe my tools got it wrong. Resend the series
and I'll check it again.


Thanks. We will resend this series soon.

--
Best wishes,
Wen