Re: [PATCH 07/14] x86/ibt: Clean up is_endbr()

From: Peter Zijlstra
Date: Mon Sep 30 2024 - 04:30:50 EST


On Sun, Sep 29, 2024 at 10:32:38AM -0700, Alexei Starovoitov wrote:
> On Fri, Sep 27, 2024 at 12:50 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1027,19 +1027,9 @@ static const struct bpf_func_proto bpf_g
> > #ifdef CONFIG_X86_KERNEL_IBT
> > static unsigned long get_entry_ip(unsigned long fentry_ip)
> > {
> > - u32 instr;
> > + if (is_endbr((void *)(fentry_ip - ENDBR_INSN_SIZE)))
> > + return fentry_ip - ENDBR_INSN_SIZE;
> >
> > - /* We want to be extra safe in case entry ip is on the page edge,
> > - * but otherwise we need to avoid get_kernel_nofault()'s overhead.
> > - */
> > - if ((fentry_ip & ~PAGE_MASK) < ENDBR_INSN_SIZE) {
> > - if (get_kernel_nofault(instr, (u32 *)(fentry_ip - ENDBR_INSN_SIZE)))
> > - return fentry_ip;
> > - } else {
> > - instr = *(u32 *)(fentry_ip - ENDBR_INSN_SIZE);
> > - }
> > - if (is_endbr(instr))
> > - fentry_ip -= ENDBR_INSN_SIZE;
> > return fentry_ip;
>
> Pls don't.
>
> This re-introduces the overhead that we want to avoid.
>
> Just call __is_endbr() here and keep the optimization.

Well, I could do that ofcourse, but as I wrote elsewhere, the right
thing to do is to optimize get_kernel_nofault(), its current
implementation is needlessly expensive. All we really need is a load
with an exception entry, the STAC/CLAC and speculation nonsense should
not be needed.

Fixing get_kernel_nofault() benefits all users.