Re: [PATCH] x86: Implement _THIS_IP_ using inline asm for 32-bit
From: Marco Elver
Date: Thu May 21 2026 - 07:53:14 EST
On Thu, 21 May 2026 at 12:20, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, May 21, 2026 at 02:55:22AM -0700, H. Peter Anvin wrote:
> > On May 21, 2026 12:08:01 AM PDT, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >On Thu, May 21, 2026 at 02:00:09AM +0200, Marco Elver wrote:
> > >> Both GCC [1] and Clang [2] consider the generic version of _THIS_IP_ to
> > >> be broken:
> > >>
> > >> #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
> > >>
> > >> In particular, the address of a label is only expected to be used with a
> > >> computed goto.
> > >>
> > >> While the generic version more or less works today, it is known to be
> > >> brittle and may break with current and future optimizations. For
> > >> example, Clang -O2 always returns 1 when this function is inlined:
> > >>
> > >> static inline unsigned long get_ip(void)
> > >> { return ({ __label__ __here; __here: (unsigned long)&&__here; }); }
> > >>
> > >
> > >Oh gawd :/
> > >
> > >> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071 [1]
> > >> Link: https://github.com/llvm/llvm-project/issues/138272 [2]
> > >> Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> > >> ---
> > >> arch/x86/include/asm/linkage.h | 3 ++-
> > >> 1 file changed, 2 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h
> > >> index a7294656ad90..bce3c6f4b94f 100644
> > >> --- a/arch/x86/include/asm/linkage.h
> > >> +++ b/arch/x86/include/asm/linkage.h
> > >> @@ -13,11 +13,12 @@
> > >> * The generic version tends to create spurious ENDBR instructions under
> > >> * certain conditions.
> > >> */
> > >> -#define _THIS_IP_ ({ unsigned long __here; asm ("lea 0(%%rip), %0" : "=r" (__here)); __here; })
> > >> +#define _THIS_IP_ ({ unsigned long __here; asm volatile("lea 0(%%rip), %0" : "=r" (__here)); __here; })
> > >> #endif
> > >>
> > >> #ifdef CONFIG_X86_32
> > >> #define asmlinkage CPP_ASMLINKAGE __attribute__((regparm(0)))
> > >> +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("call 1f\n1: pop %0" : "=r" (__ip)); __ip; })
> > >
> > >This will mess up the RSB and cause bad performance ripple effects for a
> > >bit each use. Now, I don't think anybody still cares about performance
> > >on 32bit (I certainly don't), so perhaps this is fine. But urgh.
> >
> > Most microarchitectures do *not* have a problem with call/pop, as they
> > know that call with a zero offset is not going to return. The main
> > exception was the Pentium 4.
>
> Oh, that's good to know. Still the "1: mov $1b, %reg" thing is shorter,
> and generates the exact same code the compilers used to (and GCC still
> does). Isn't that a better option?

It should work; it just means it's going to emit relocations. If most
microarchitectures do in fact recognize the PIC variant (call/pop) and
optimize it, it might be better to avoid the relocations, as that would
produce more compact kernel images.

Also, while most kernel code doesn't need to be PIC (it's -fno-PIE),
there are a few special bits that are PIC (arch/x86/boot/startup ?),
so if you want this to be generic you need 2 versions guarded by
`#ifdef __PIC__`.
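
To make the two-version idea concrete, here is a rough userspace sketch of
what I mean. This is not the patch: THIS_IP_SKETCH, sketch_ip() and
sketch_selfcheck() are made-up names for illustration, and the x86-64 and
generic branches are only there so the snippet builds on other hosts. It
assumes GCC or Clang, since it relies on statement expressions and local
labels.

```c
#include <assert.h>

/*
 * Illustrative sketch only; THIS_IP_SKETCH is a hypothetical name, not the
 * kernel's _THIS_IP_.
 */
#if defined(__i386__) && defined(__PIC__)
/* PIC build: call/pop yields the address without needing a relocation. */
#define THIS_IP_SKETCH ({ unsigned long __ip; \
	__asm__ volatile("call 1f\n1: pop %0" : "=r" (__ip)); __ip; })
#elif defined(__i386__)
/* Non-PIC build: absolute address of a local label, needs a relocation. */
#define THIS_IP_SKETCH ({ unsigned long __ip; \
	__asm__ volatile("1: mov $1b, %0" : "=r" (__ip)); __ip; })
#elif defined(__x86_64__)
/* 64-bit: RIP-relative lea, as in the existing x86-64 definition. */
#define THIS_IP_SKETCH ({ unsigned long __ip; \
	__asm__ volatile("lea 0(%%rip), %0" : "=r" (__ip)); __ip; })
#else
/* Any other host: fall back to the generic label-address version. */
#define THIS_IP_SKETCH ({ __label__ __here; __here: (unsigned long)&&__here; })
#endif

/* Kept out of line so the address is taken inside this function. */
__attribute__((noinline)) unsigned long sketch_ip(void)
{
	return THIS_IP_SKETCH;
}

/* Two separate use sites; each should observe a nonzero address. */
void sketch_selfcheck(void)
{
	assert(sketch_ip() != 0);
	assert(THIS_IP_SKETCH != 0);
}
```

Only the two __i386__ branches matter for this thread. Which one is picked
depends purely on whether the compiler defines __PIC__ for that translation
unit, so the -fno-PIE bulk of the kernel would get the mov form and the few
PIC objects would get call/pop.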