Re: [PATCH] x86: Implement _THIS_IP_ using inline asm for 32-bit

From: Marco Elver

Date: Thu May 21 2026 - 04:40:46 EST


On Thu, 21 May 2026 at 09:09, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, May 21, 2026 at 02:00:09AM +0200, Marco Elver wrote:
> > Both GCC [1] and Clang [2] consider the generic version of _THIS_IP_ to
> > be broken:
> >
> > #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
> >
> > In particular, the address of a label is only expected to be used with a
> > computed goto.
> >
> > While the generic version more or less works today, it is known to be
> > brittle and may break with current and future optimizations. For
> > example, Clang -O2 always returns 1 when this function is inlined:
> >
> > static inline unsigned long get_ip(void)
> > { return ({ __label__ __here; __here: (unsigned long)&&__here; }); }
> >
>
> Oh gawd :/
>
> > Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071 [1]
> > Link: https://github.com/llvm/llvm-project/issues/138272 [2]
> > Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> > ---
> > arch/x86/include/asm/linkage.h | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h
> > index a7294656ad90..bce3c6f4b94f 100644
> > --- a/arch/x86/include/asm/linkage.h
> > +++ b/arch/x86/include/asm/linkage.h
> > @@ -13,11 +13,12 @@
> > * The generic version tends to create spurious ENDBR instructions under
> > * certain conditions.
> > */
> > -#define _THIS_IP_ ({ unsigned long __here; asm ("lea 0(%%rip), %0" : "=r" (__here)); __here; })
> > +#define _THIS_IP_ ({ unsigned long __here; asm volatile("lea 0(%%rip), %0" : "=r" (__here)); __here; })
> > #endif
> >
> > #ifdef CONFIG_X86_32
> > #define asmlinkage CPP_ASMLINKAGE __attribute__((regparm(0)))
> > +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("call 1f\n1: pop %0" : "=r" (__ip)); __ip; })
>
> This will mess up the RSB and cause bad performance ripple effects for a
> bit each use. Now, I don't think anybody still cares about performance
> on 32bit (I certainly don't), so perhaps this is fine. But urgh.

Yeah - up to you. GCC still appears to do the right thing, even for
32-bit: https://godbolt.org/z/3PWPK8E4f

Here's the "Clang returns 1" case: https://godbolt.org/z/KjMvEWeM5
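
For anyone who wants to try this outside the kernel tree, a minimal
userspace sketch follows. It is an illustration, not the contents of
either godbolt link. The names THIS_IP_LABEL, THIS_IP_ASM, ret_ip(),
get_ip_label(), get_ip_asm() and check_ips() are hypothetical; only the
macro bodies are taken from the patch discussion above. The non-x86
fallback via __builtin_return_address(0) is an assumption added so the
file builds elsewhere, and it is not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Generic label-address form, as quoted in the patch description. */
#define THIS_IP_LABEL \
	({ __label__ __here; __here: (unsigned long)&&__here; })

/* Hypothetical fallback for non-x86 builds: the caller's return address. */
static __attribute__((noinline, unused)) unsigned long ret_ip(void)
{
	return (unsigned long)__builtin_return_address(0);
}

#if defined(__x86_64__)
/* RIP-relative lea, as in the 64-bit definition touched by the patch. */
#define THIS_IP_ASM \
	({ unsigned long __ip; \
	   __asm__ __volatile__("lea 0(%%rip), %0" : "=r" (__ip)); __ip; })
#elif defined(__i386__)
/* call/pop, as in the proposed 32-bit definition. */
#define THIS_IP_ASM \
	({ unsigned long __ip; \
	   __asm__ __volatile__("call 1f\n1: pop %0" : "=r" (__ip)); __ip; })
#else
#define THIS_IP_ASM ret_ip()
#endif

/* Inlinable wrappers, mirroring the get_ip() example in the description. */
static inline unsigned long get_ip_label(void) { return THIS_IP_LABEL; }
static inline unsigned long get_ip_asm(void)   { return THIS_IP_ASM; }

/* Print both values; only the asm-based one is checked against the bogus 1. */
static void check_ips(void)
{
	unsigned long label_ip = get_ip_label();
	unsigned long asm_ip = get_ip_asm();

	printf("label: %#lx\nasm:   %#lx\n", label_ip, asm_ip);
	assert(asm_ip > 1);
}
```

Calling check_ips() from main() and building with both gcc and clang
at -O2 (and with -m32 where a 32-bit toolchain is installed) should
make it easy to compare the two forms. The label-based value is
deliberately left unchecked, since that is the one the compilers
consider unreliable.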