[RFC PATCH 00/11] Add a percpu subsection for hot data
From: Brian Gerst
Date: Sat Feb 22 2025 - 14:06:45 EST
Add a new percpu subsection for data that is frequently accessed and
exclusive to each processor. This is intended to replace the pcpu_hot
struct on X86, and is available to all architectures.
The one caveat with this approach is that it depends on the linker to
efficiently pack data that is smaller than the machine word size. The
binutils linker does this properly:
ffffffff842f6000 D __per_cpu_hot_start
ffffffff842f6000 D softirq_pending
ffffffff842f6002 D hardirq_stack_inuse
ffffffff842f6008 D hardirq_stack_ptr
ffffffff842f6010 D __ref_stack_chk_guard
ffffffff842f6010 D __stack_chk_guard
ffffffff842f6018 D const_cpu_current_top_of_stack
ffffffff842f6018 D cpu_current_top_of_stack
ffffffff842f6020 D const_current_task
ffffffff842f6020 D current_task
ffffffff842f6028 D __preempt_count
ffffffff842f602c D cpu_number
ffffffff842f6030 D this_cpu_off
ffffffff842f6038 D __x86_call_depth
ffffffff842f6040 D __per_cpu_hot_end
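As a quick check of the listing above: the distance from __per_cpu_hot_start to __per_cpu_hot_end is 0x40 bytes, so the hot data fills exactly one cacheline if a cacheline is 64 bytes. That size is an assumption here (it is typical for x86-64, but the listing itself does not state it). A minimal sketch of the arithmetic follows; the macro and helper names are illustrative only and are not part of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines, as is typical on x86-64. */
#define ASSUMED_CACHELINE_BYTES 64u

/* Addresses copied from the symbol listing. */
#define HOT_START 0xffffffff842f6000ull /* __per_cpu_hot_start */
#define HOT_END   0xffffffff842f6040ull /* __per_cpu_hot_end */

/* Compile-time check that the listed region spans 0x40 bytes. */
static_assert(HOT_END - HOT_START == 0x40, "hot section spans 0x40 bytes");

/* Size in bytes of the half-open region [start, end). */
uint64_t hot_span(uint64_t start, uint64_t end)
{
	return end - start;
}

/* True if the non-empty region [start, end) lies within one cacheline. */
bool fits_one_cacheline(uint64_t start, uint64_t end)
{
	if (end <= start)
		return false;
	return start / ASSUMED_CACHELINE_BYTES ==
	       (end - 1) / ASSUMED_CACHELINE_BYTES;
}
```

With the addresses from the listing, hot_span(HOT_START, HOT_END) is 64 and fits_one_cacheline(HOT_START, HOT_END) is true; a single extra byte past __per_cpu_hot_end would make the check fail.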
The LLVM linker (lld) packs the smaller data objects less tightly,
so the hot section spills over into a second cacheline.
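To illustrate why packing matters, here is a rough model of section layout, not a description of what either linker actually does. The object sizes are inferred from the address gaps in the binutils listing (so they are upper bounds), natural alignment is assumed, and aliases that share an address are listed once. The 8-byte minimum alignment in the second case is a purely hypothetical stand-in for looser packing; the struct, function, and array names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one object in the hot section. */
struct hot_obj {
	const char *name;
	size_t size;	/* bytes, inferred from address gaps (upper bound) */
	size_t align;	/* assumed natural alignment, a power of two */
};

#define EXAMPLE_OBJ_COUNT 10

/* One entry per distinct address in the listing; aliases are omitted. */
const struct hot_obj example_objs[EXAMPLE_OBJ_COUNT] = {
	{ "softirq_pending",          2, 2 },
	{ "hardirq_stack_inuse",      1, 1 },
	{ "hardirq_stack_ptr",        8, 8 },
	{ "__stack_chk_guard",        8, 8 },
	{ "cpu_current_top_of_stack", 8, 8 },
	{ "current_task",             8, 8 },
	{ "__preempt_count",          4, 4 },
	{ "cpu_number",               4, 4 },
	{ "this_cpu_off",             8, 8 },
	{ "__x86_call_depth",         8, 8 },
};

static_assert(sizeof(example_objs) / sizeof(example_objs[0]) == EXAMPLE_OBJ_COUNT,
	      "one entry per distinct address");

/* Round off up to the next multiple of align (a power of two). */
size_t align_up(size_t off, size_t align)
{
	return (off + align - 1) & ~(align - 1);
}

/*
 * Place the objects in order, each at the next offset that satisfies
 * max(own alignment, min_align), and return the offset just past the
 * last object.
 */
size_t layout_span(const struct hot_obj *objs, size_t n, size_t min_align)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		size_t a = objs[i].align > min_align ? objs[i].align : min_align;

		off = align_up(off, a) + objs[i].size;
	}
	return off;
}
```

Under these assumptions, layout_span(example_objs, EXAMPLE_OBJ_COUNT, 1) returns 64, matching the 0x40-byte span in the listing, while forcing every object onto an 8-byte boundary (min_align of 8) returns 80, which would no longer fit in a single 64-byte cacheline.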
Brian Gerst (11):
  percpu: Introduce percpu hot section
  x86/preempt: Move preempt count to percpu hot section
  x86/smp: Move cpu number to percpu hot section
  x86/retbleed: Move call depth to percpu hot section
  x86/percpu: Move top_of_stack to percpu hot section
  x86/percpu: Move current_task to percpu hot section
  x86/softirq: Move softirq_pending to percpu hot section
  x86/irq: Move irq stacks to percpu hot section
  x86/percpu: Remove pcpu_hot
  x86/stackprotector: Move __stack_chk_guard to percpu hot section
  x86/smp: Move this_cpu_off to percpu hot section
 arch/x86/entry/entry_32.S             |  4 +--
 arch/x86/entry/entry_64.S             |  6 ++---
 arch/x86/entry/entry_64_compat.S      |  4 +--
 arch/x86/include/asm/current.h        | 35 ++++-----------------------
 arch/x86/include/asm/hardirq.h        |  3 ++-
 arch/x86/include/asm/irq_stack.h      | 12 ++++-----
 arch/x86/include/asm/nospec-branch.h  | 10 +++++---
 arch/x86/include/asm/percpu.h         |  4 +--
 arch/x86/include/asm/preempt.h        | 25 ++++++++++---------
 arch/x86/include/asm/processor.h      | 15 ++++++++++--
 arch/x86/include/asm/smp.h            |  7 +++---
 arch/x86/include/asm/stackprotector.h |  2 +-
 arch/x86/kernel/asm-offsets.c         |  5 ----
 arch/x86/kernel/callthunks.c          |  3 +++
 arch/x86/kernel/cpu/common.c          | 17 +++++++------
 arch/x86/kernel/dumpstack_32.c        |  4 +--
 arch/x86/kernel/dumpstack_64.c        |  2 +-
 arch/x86/kernel/head_64.S             |  4 +--
 arch/x86/kernel/irq.c                 |  8 ++++++
 arch/x86/kernel/irq_32.c              | 12 +++++----
 arch/x86/kernel/irq_64.c              |  6 ++---
 arch/x86/kernel/process_32.c          |  6 ++---
 arch/x86/kernel/process_64.c          |  6 ++---
 arch/x86/kernel/setup_percpu.c        |  7 ++++--
 arch/x86/kernel/smpboot.c             |  4 +--
 arch/x86/kernel/vmlinux.lds.S         |  5 +++-
 arch/x86/lib/retpoline.S              |  2 +-
 include/asm-generic/vmlinux.lds.h     | 10 ++++++++
 include/linux/percpu-defs.h           | 10 ++++++++
 kernel/bpf/verifier.c                 |  4 +--
 scripts/gdb/linux/cpus.py             |  2 +-
 31 files changed, 135 insertions(+), 109 deletions(-)
base-commit: 01157ddc58dc2fe428ec17dd5a18cc13f134639f
--
2.48.1