Re: [PATCH v3 00/11] Add a percpu subsection for cache hot data
From: Uros Bizjak
Date: Tue Mar 04 2025 - 11:31:36 EST
On Tue, Mar 4, 2025 at 4:00 PM Brian Gerst <brgerst@xxxxxxxxx> wrote:
>
> On Tue, Mar 4, 2025 at 4:55 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> >
> > * Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >
> > > On Tue, Mar 4, 2025 at 10:48 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> > > >
> > > >
> > > > * Brian Gerst <brgerst@xxxxxxxxx> wrote:
> > > >
> > > > > On Tue, Mar 4, 2025 at 3:47 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> > > > > >
> > > > > >
> > > > > > * Brian Gerst <brgerst@xxxxxxxxx> wrote:
> > > > > >
> > > > > > > >
> > > > > > > > - PERCPU_SECTION(INTERNODE_CACHE_BYTES)
> > > > > > > > + PERCPU_SECTION(L1_CACHE_BYTES)
> > > > > > > > ASSERT(__per_cpu_hot_end - __per_cpu_hot_start <= 64, "percpu cache hot section too large")
> > > > > > > >
> > > > > > > > RUNTIME_CONST_VARIABLES
> > > > > > > >
> > > > > > >
> > > > > > > That is probably the right call. The initial percpu section is just
> > > > > > > used by the boot cpu early and as a template for the dynamically
> > > > > > > allocated percpu memory, which should account for the proper
> > > > > > > alignment for NUMA.
> > > > > >
> > > > > > Okay.
> > > > > >
> > > > > > Randconfig testing found another corner case with the attached config:
> > > > > >
> > > > > > KSYMS .tmp_vmlinux0.kallsyms.S
> > > > > > AS .tmp_vmlinux0.kallsyms.o
> > > > > > LD .tmp_vmlinux1
> > > > > > ld: percpu cache hot section too large
> > > > > > make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
> > > > > >
> > > > > > (I haven't figured out the root cause yet.)
> > > > >
> > > > > CONFIG_MPENTIUM4 sets X86_L1_CACHE_SHIFT to 7 (128 bytes).
> > > >
> > > > Hm, to resolve this I'd go for the easy out of explicitly using '64' as
> > > > the size limit - like we did it in the C space.
> > >
> > > Why not simply:
> > >
> > > ASSERT(__per_cpu_hot_end - __per_cpu_hot_start <= L1_CACHE_BYTES, "...")
> > >
> > > ?
> >
> > I don't think it's a great idea to randomly allow a larger section
> > depending on the .config ... The *actual* intended limit is 64, not 128
> > and not 4096, so I'd suggest we write it out as before.
>
> Change the assert to:
> ASSERT(__per_cpu_hot_pad - __per_cpu_hot_start <= 64, "percpu cache hot section too large")
>
> We only care about the used portion, not the padded end.

If that is the case, it is perhaps better to let __per_cpu_hot_end
mark the end of the real data, i.e. before the cacheline padding, and
to drop __per_cpu_hot_pad altogether, as in the attached patch.

Uros.

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 4ed0e6a013d0..58a635a6d5bd 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1071,9 +1071,8 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
 	. = ALIGN(cacheline); \
 	__per_cpu_hot_start = .; \
 	*(SORT_BY_ALIGNMENT(.data..percpu..hot.*)) \
-	__per_cpu_hot_pad = .; \
-	. = ALIGN(cacheline); \
 	__per_cpu_hot_end = .; \
+	. = ALIGN(cacheline); \
 	*(.data..percpu..read_mostly) \
 	. = ALIGN(cacheline); \
 	*(.data..percpu) \