Re: [powerpc] ftrace warning kernel/trace/ftrace.c:2068 with code-patching selftests
From: Mark Rutland
Date: Thu Jan 27 2022 - 07:59:49 EST

On Thu, Jan 27, 2022 at 01:22:17PM +0100, Ard Biesheuvel wrote:
> On Thu, 27 Jan 2022 at 13:20, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > On Thu, Jan 27, 2022 at 01:03:34PM +0100, Ard Biesheuvel wrote:
> >
> > > These architectures use place-relative extables for the same reason:
> > > place relative references are resolved at build time rather than at
> > > runtime during relocation, making a build time sort feasible.
> > >
> > > arch/alpha/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/arm64/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/ia64/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/parisc/include/asm/uaccess.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/powerpc/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/riscv/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/s390/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/x86/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > >
> > > Note that the swap routine becomes something like the below, given
> > > that the relative references need to be fixed up after the entry
> > > changes place in the sorted list.
> > >
> > > static void swap_ex(void *a, void *b, int size)
> > > {
> > > 	struct exception_table_entry *x = a, *y = b, tmp;
> > > 	int delta = b - a;
> > >
> > > 	tmp = *x;
> > > 	x->insn = y->insn + delta;
> > > 	y->insn = tmp.insn - delta;
> > > 	...
> > > }
> > >
> > > As a bonus, the resulting footprint of the table in the image is
> > > reduced by 8x, given that every 8 byte pointer has an accompanying 24
> > > byte RELA record, so we go from 32 bytes to 4 bytes for every call to
> > > __gnu_mcount_mc.
> >
> > Absolutely -- it'd be great if we could do that for the callsite locations; the
> > difficulty is that the entries are generated by the compiler itself, so we'd
> > either need some build/link time processing to convert each absolute 64-bit
> > value to a relative 32-bit offset, or new compiler options to generate those as
> > relative offsets from the outset.
>
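To make the conversion concrete, here is a rough userspace sketch of what such a build/link-time step might do. It is not taken from any existing tool: the helper names `to_relative()` and `from_relative()` are hypothetical, and it assumes the final address of the table (`table_addr`) is known when the conversion runs. Each 64-bit absolute address becomes a 32-bit offset relative to the slot that holds it, and the conversion fails if a target is out of 32-bit range.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert n absolute 64-bit addresses into
 * place-relative 32-bit offsets. Entry i will live at
 * table_addr + i * sizeof(int32_t), and stores (target - place).
 * Returns 0 on success, -1 if an offset does not fit in 32 bits.
 */
static int to_relative(const uint64_t *abs_addr, int32_t *rel, size_t n,
		       uint64_t table_addr)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t place = table_addr + i * sizeof(rel[0]);
		int64_t off = (int64_t)(abs_addr[i] - place);

		if (off < INT32_MIN || off > INT32_MAX)
			return -1;
		rel[i] = (int32_t)off;
	}
	return 0;
}

/*
 * Hypothetical helper: recover the absolute address of entry i at
 * runtime by adding the stored offset to the address of its slot.
 */
static uint64_t from_relative(const int32_t *rel, size_t i,
			      uint64_t table_addr)
{
	return table_addr + i * sizeof(rel[0]) + (int64_t)rel[i];
}
```

With this layout each entry takes 4 bytes and needs no RELA record, which is where the 32-byte to 4-byte saving mentioned above comes from.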
> Don't we use scripts/recordmcount.pl for that?

Not quite -- we could adjust it to do so, but today it doesn't consider
existing mcount_loc entries, and just generates new ones where the compiler has
generated calls to mcount, which it finds by scanning the instructions in the
binary. Today it is not used in the following cases:

* On arm64 when we default to using `-fpatchable-function-entry=N`. That makes
  the compiler insert 2 NOPs in the function prologue, and log the location of
  that NOP sled to a section called `__patchable_function_entries`.

  We need the compiler to do that since we can't reliably identify 2 NOPs in a
  function prologue as being intended to be a patch site, as e.g. there could
  be notrace functions where the compiler had to insert NOPs for alignment of a
  subsequent branch or similar.
* On architectures with `-mnop-mcount`. On these, it's necessary to use
  `-mrecord-mcount` to have the compiler log the patch-site, for the same
  reason as with `-fpatchable-function-entry`.
* On architectures with `-mrecord-mcount` generally, since this avoids the
  overhead of scanning each object.
* On x86 when objtool is used.
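To illustrate the ambiguity in the arm64 case, here is a small userspace sketch, not kernel code. The helper names and the idea of storing recorded sites as instruction indices are assumptions made for the example; only the AArch64 NOP encoding (0xd503201f) is real. A scan that looks for two adjacent NOPs cannot tell a patch site from padding, whereas a lookup in a list recorded by the compiler can.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define A64_NOP	0xd503201fu	/* AArch64 NOP encoding */

/*
 * Naive scan (hypothetical): treat any two adjacent NOPs starting at
 * instruction index idx as a patch site. This also matches NOPs that
 * the compiler emitted purely for alignment.
 */
static bool looks_like_patch_site(const uint32_t *insns, size_t n, size_t idx)
{
	return idx + 1 < n &&
	       insns[idx] == A64_NOP && insns[idx + 1] == A64_NOP;
}

/*
 * Recorded lookup (hypothetical): idx is a patch site only if it
 * appears in the list of locations logged at compile time, modelled
 * here as a plain array of instruction indices.
 */
static bool is_recorded_patch_site(const size_t *recorded, size_t nr,
				   size_t idx)
{
	for (size_t i = 0; i < nr; i++) {
		if (recorded[i] == idx)
			return true;
	}
	return false;
}
```

Given a buffer holding a traced function (NOP, NOP, RET) followed by a notrace function that happens to begin with two alignment NOPs, the scan reports both as patch sites, while the recorded lookup only accepts the first.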
Thanks,
Mark.