Re: [PATCH v3 00/11] Add a percpu subsection for cache hot data

From: Ingo Molnar
Date: Mon Mar 03 2025 - 15:39:09 EST



* Brian Gerst <brgerst@xxxxxxxxx> wrote:

> Add a new percpu subsection for data that is frequently accessed and
> exclusive to each processor. This replaces the pcpu_hot struct on x86,
> and is available to all architectures and the core kernel.
>
> ffffffff834f5000 D __per_cpu_hot_start
> ffffffff834f5000 D hardirq_stack_ptr
> ffffffff834f5008 D __ref_stack_chk_guard
> ffffffff834f5008 D __stack_chk_guard
> ffffffff834f5010 D const_cpu_current_top_of_stack
> ffffffff834f5010 D cpu_current_top_of_stack
> ffffffff834f5018 D const_current_task
> ffffffff834f5018 D current_task
> ffffffff834f5020 D __x86_call_depth
> ffffffff834f5028 D this_cpu_off
> ffffffff834f5030 D __preempt_count
> ffffffff834f5034 D cpu_number
> ffffffff834f5038 D __softirq_pending
> ffffffff834f503a D hardirq_stack_inuse
> ffffffff834f503b D __per_cpu_hot_pad
> ffffffff834f5040 D __per_cpu_hot_end
>
> This applies to the tip/x86/asm branch.
>
> Changes in v3:
> - Fix typo of CACHE_HOT_DATA()
> - Move hardirq_stack_inuse to irq_64.c
> - Add __per_cpu_hot_pad to show the end of the actual data
>
> Brian Gerst (11):
> percpu: Introduce percpu hot section
> x86/percpu: Move pcpu_hot to percpu hot section
> x86/preempt: Move preempt count to percpu hot section
> x86/smp: Move cpu number to percpu hot section
> x86/retbleed: Move call depth to percpu hot section
> x86/softirq: Move softirq_pending to percpu hot section
> x86/irq: Move irq stacks to percpu hot section
> x86/percpu: Move top_of_stack to percpu hot section
> x86/percpu: Move current_task to percpu hot section
> x86/stackprotector: Move __stack_chk_guard to percpu hot section
> x86/smp: Move this_cpu_off to percpu hot section
>
> 31 files changed, 146 insertions(+), 111 deletions(-)

Yeah, so the result is that on x86-64 allmodconfig we now get:

ld: percpu cache hot section too large

See the relevant .tmp_vmlinux1.map below.

Which is due to:

CONFIG_X86_INTERNODE_CACHE_SHIFT=12

Increasing 'cache alignment' to 4096 bytes:

PERCPU_SECTION(INTERNODE_CACHE_BYTES)

... because of the vSMP muck:

config X86_INTERNODE_CACHE_SHIFT
int
default "12" if X86_VSMP
default X86_L1_CACHE_SHIFT

The workaround would be to use L1_CACHE_BYTES in PERCPU_SECTION()
instead, but I really dislike what vSMP is doing here.

Anyway, I applied the short-term fix to patch 02/11, but I'm not sure
it's the right fix.

Thanks,

Ingo

=====================>
0xffffffff8664f000 . = ALIGN (0x1000)
0xffffffff8664f000 __per_cpu_hot_start = .
*(SORT_BY_ALIGNMENT(.data..percpu..hot.*))
.data..percpu..hot..hardirq_stack_ptr
0xffffffff8664f000 0x8 vmlinux.o
0xffffffff8664f000 hardirq_stack_ptr
.data..percpu..hot..__stack_chk_guard
0xffffffff8664f008 0x8 vmlinux.o
0xffffffff8664f008 __stack_chk_guard
.data..percpu..hot..cpu_current_top_of_stack
0xffffffff8664f010 0x8 vmlinux.o
0xffffffff8664f010 cpu_current_top_of_stack
.data..percpu..hot..current_task
0xffffffff8664f018 0x8 vmlinux.o
0xffffffff8664f018 current_task
.data..percpu..hot..__x86_call_depth
0xffffffff8664f020 0x8 vmlinux.o
0xffffffff8664f020 __x86_call_depth
.data..percpu..hot..this_cpu_off
0xffffffff8664f028 0x8 vmlinux.o
0xffffffff8664f028 this_cpu_off
.data..percpu..hot..__preempt_count
0xffffffff8664f030 0x4 vmlinux.o
0xffffffff8664f030 __preempt_count
.data..percpu..hot..cpu_number
0xffffffff8664f034 0x4 vmlinux.o
0xffffffff8664f034 cpu_number
.data..percpu..hot..__softirq_pending
0xffffffff8664f038 0x2 vmlinux.o
0xffffffff8664f038 __softirq_pending
.data..percpu..hot..hardirq_stack_inuse
0xffffffff8664f03a 0x1 vmlinux.o
0xffffffff8664f03a hardirq_stack_inuse
0xffffffff8664f03b __per_cpu_hot_pad = .
0xffffffff86650000 . = ALIGN (0x1000)
*fill* 0xffffffff8664f03b 0xfc5
0xffffffff86650000 __per_cpu_hot_end = .
*(.data..percpu..read_mostly)
.data..percpu..read_mostly
0xffffffff86650000 0xa30 vmlinux.o

=================>

arch/x86/kernel/vmlinux.lds.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 9ac6b42701fa..31f9102b107f 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -330,7 +330,7 @@ SECTIONS
EXIT_DATA
}

- PERCPU_SECTION(INTERNODE_CACHE_BYTES)
+ PERCPU_SECTION(L1_CACHE_BYTES)
ASSERT(__per_cpu_hot_end - __per_cpu_hot_start <= 64, "percpu cache hot section too large")

RUNTIME_CONST_VARIABLES