Re: [PATCH -next V7 0/7] riscv: Optimize function trace
From: Guo Ren
Date: Wed Feb 08 2023 - 20:59:55 EST
On Thu, Feb 9, 2023 at 9:51 AM Guo Ren <guoren@xxxxxxxxxx> wrote:
>
> On Thu, Feb 9, 2023 at 6:29 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
> >
> > > > # Note: aligned to 8 bytes
> > > > addr-08 // Literal (first 32-bits) // patched to ops ptr
> > > > addr-04 // Literal (last 32-bits) // patched to ops ptr
> > > > addr+00 func: mv t0, ra
> > > We needn't "mv t0, ra" here because our "jalr" could work with t0 and
> > > won't affect ra. Let's do it in the trampoline code, and then we can
> > > save another word here.
> > > > addr+04 auipc t1, ftrace_caller
> > > > addr+08 jalr ftrace_caller(t1)
> >
> > Is that some kind of 'load high' and 'add offset' pair?
> Yes.
>
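For illustration only, a rough C sketch of how such a "load high / add offset" pair splits a 32-bit PC-relative offset. The names split_offset and join_offset are made up for this example and are not kernel helpers. The +0x800 rounding compensates for jalr sign-extending its 12-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split a signed 32-bit PC-relative offset into the two immediates of an
 * auipc/jalr pair: hi20 goes into auipc, lo12 (signed) into jalr.
 * Note: on rv64 offsets at or above 0x7ffff800 are out of reach because
 * the auipc immediate is sign-extended; this sketch ignores that corner.
 */
static void split_offset(int32_t offset, int32_t *hi20, int32_t *lo12)
{
    uint32_t u = (uint32_t)offset;
    int32_t lo = (int32_t)(u & 0xfffu);

    /* jalr sign-extends its immediate, so map 0x800..0xfff to -2048..-1 */
    if (lo >= 0x800)
        lo -= 0x1000;

    /* round up the high part whenever the low part became negative */
    *hi20 = (int32_t)(((u + 0x800u) >> 12) & 0xfffffu);
    *lo12 = lo;
}

/* Recombine the two immediates the way the hardware adds them (mod 2^32). */
static int32_t join_offset(int32_t hi20, int32_t lo12)
{
    assert(lo12 >= -2048 && lo12 <= 2047);
    return (int32_t)(((uint32_t)hi20 << 12) + (uint32_t)lo12);
}
```

For example, an offset of 0x12345fff splits into hi20 = 0x12346 and lo12 = -1, and joining them gives 0x12345fff again.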
> > I guess 64bit kernels guarantee to put all module code
> > within +-2G of the main kernel?
> Yes, 32 bits is enough. So current rv64 needs only one 32-bit literal,
> just like CONFIG_32BIT.
We need kernel_addr_base + this 32-bit Literal.
@Mark Rutland
What do you think of the idea of saving one more 32-bit word at the
call-site? (It should also work for arm64.)
>
> >
> > > Here is the call-site:
> > > # Note: aligned to 8 bytes
> > > addr-08 // Literal (first 32-bits) // patched to ops ptr
> > > addr-04 // Literal (last 32-bits) // patched to ops ptr
> > > addr+00 auipc t0, ftrace_caller
> > > addr+04 jalr ftrace_caller(t0)
> >
> > Could you even do something like:
> > addr-n call ftrace-function
> > addr-n+x literals
> > addr+0 nop or jmp addr-n
> > addr+4 function_code
> Yours costs one more instruction, right?
> addr-12 auipc
> addr-8 jalr
> addr-4 // Literal (32-bits)
> addr+0 nop or jmp addr-n // one more?
> addr+4 function_code
>
> > So that all the code executed when tracing is enabled
> > is before the label and only one 'nop' is in the body.
> > The called code can use the return address to find the
> > literals and then modify it to return to addr+4.
> > The code cost when trace is enabled is probably irrelevant
> > here - dominated by what happens later.
> > It probably isn't even worth aligning a 64bit constant.
> > Doing two reads probably won't be noticeable.
> >
> > What you do want to ensure is that the initial patch is
> > overwriting nop - just in case the gap isn't there.
> >
> > David
> >
--
Best Regards
Guo Ren