Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
On Tue, Oct 1, 2024 at 12:18 AM Hari Bathini <hbathini@xxxxxxxxxxxxx> wrote:
On 30/09/24 6:25 pm, Alexei Starovoitov wrote:
On Sun, Sep 29, 2024 at 10:33 PM Hari Bathini <hbathini@xxxxxxxxxxxxx> wrote:
On 17/09/24 1:20 pm, Alexei Starovoitov wrote:
On Sun, Sep 15, 2024 at 10:58 PM Hari Bathini <hbathini@xxxxxxxxxxxxx> wrote:
...
+
+ /*
+ * Generated stack layout:
+ *
+ * func prev back chain [ back chain ]
+ * [ ]
+ * bpf prog redzone/tailcallcnt [ ... ] 64 bytes (64-bit powerpc)
+ * [ ] --
+
+ /* Dummy frame size for proper unwind - includes 64-bytes red zone for 64-bit powerpc */
+ bpf_dummy_frame_size = STACK_FRAME_MIN_SIZE + 64;
What is the goal of such a large "red zone" ?
The kernel stack is a limited resource.
Why reserve 64 bytes ?
tail call cnt can probably be optional as well.
Hi Alexei, thanks for reviewing.
FWIW, the redzone on ppc64 is 288 bytes. BPF JIT for ppc64 was using
a redzone of 80 bytes since tailcall support was introduced [1].
It came down to 64 bytes thanks to [2]. The red zone is being used
to save NVRs and tail call count when a stack is not set up. I do
agree that we should look at optimizing it further. Do you think
the optimization should go as part of PPC64 trampoline enablement
being done here or should that be taken up as a separate item, maybe?
The follow up is fine.
It's just odd to me that we currently have:
[ unused red zone ] 208 bytes protected
I simply don't understand why we need to waste this much stack space.
Why can't it be zero today ?
The ABI for ppc64 has a redzone of 288 bytes below the current
stack pointer that can be used as a scratch area until a new
stack frame is created. So, no wastage of stack space as such.
It is just a red zone that can be used before a new stack frame
is created. The comment there is only to show how redzone is
being used in ppc64 BPF JIT. I think the confusion is with the
mention of "208 bytes" as protected. As not all of that scratch
area is used, it mentions the remaining as unused. Essentially
288 bytes below current stack pointer is protected from debuggers
and interrupt code (red zone). Note that it should be 224 bytes
of unused red zone instead of 208 bytes as red zone usage in
ppc64 BPF JIT came down from 80 bytes to 64 bytes since [2].
Hope that clears up the misunderstanding.
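To make the arithmetic above concrete, here is a minimal sketch in C. The constant and function names are hypothetical and exist only for illustration (they are not the identifiers used in the ppc64 BPF JIT); the numbers are the ones quoted in this thread.

```c
#include <assert.h>

/* Values quoted in the thread; all names here are hypothetical. */
enum {
	PPC64_ABI_RED_ZONE   = 288, /* bytes below the stack pointer protected by the ABI */
	BPF_JIT_RED_ZONE_OLD = 80,  /* bytes the JIT used before [2] */
	BPF_JIT_RED_ZONE_NEW = 64,  /* bytes the JIT uses since [2] */
};

/* Bytes of the ABI red zone that the JIT leaves untouched. */
static inline int red_zone_unused(int used)
{
	return PPC64_ABI_RED_ZONE - used;
}

/* Compile-time checks of the two figures discussed above. */
static_assert(PPC64_ABI_RED_ZONE - BPF_JIT_RED_ZONE_OLD == 208,
	      "figure in the existing comment (80-byte usage)");
static_assert(PPC64_ABI_RED_ZONE - BPF_JIT_RED_ZONE_NEW == 224,
	      "corrected figure (64-byte usage)");
```

The "unused" part is not extra stack that gets reserved; it is simply the remainder of the 288-byte scratch area that the ABI already protects.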
I see. That makes sense. So it's similar to amd64 red zone,
but there we have an issue with irqs, hence the kernel is
compiled with -mno-red-zone.
I assume the issue is that the interrupt entry unconditionally writes
some data below the stack pointer, disregarding the red zone?
I guess ppc always has a different interrupt stack and
it's not an issue?
No, the interrupt entry allocates a frame that is big enough to cover
the red zone as well as the space it needs to save registers.
See STACK_INT_FRAME_SIZE which includes KERNEL_REDZONE_SIZE:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/include/asm/ptrace.h?commit=8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b#n165
It is renamed to INT_FRAME_SIZE in asm-offsets.c and is then used in
the interrupt entry here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/exceptions-64s.S?commit=8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b#n497
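As a rough model of the frame allocation described above (not the kernel's actual code), the sketch below shows why stepping the stack pointer down by "red zone + register save area" keeps the interrupted code's red zone intact. All names are hypothetical, and the size of the register save area is left as a parameter because its real value is not stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* The 288-byte ppc64 red zone mentioned in the thread; the name is hypothetical. */
#define SKETCH_REDZONE_SIZE 288u

/*
 * Stack pointer after interrupt entry in this model: skip over the
 * interrupted code's red zone, then make room for the saved registers.
 */
static inline uintptr_t sketch_int_frame_sp(uintptr_t sp, uintptr_t save_area_size)
{
	return sp - (SKETCH_REDZONE_SIZE + save_area_size);
}

/*
 * Returns 1 when the save area [new_sp, new_sp + save_area_size) lies
 * entirely below the red zone [sp - SKETCH_REDZONE_SIZE, sp).
 */
static inline int sketch_red_zone_preserved(uintptr_t sp, uintptr_t save_area_size)
{
	uintptr_t new_sp = sketch_int_frame_sp(sp, save_area_size);
	uintptr_t red_zone_low = sp - SKETCH_REDZONE_SIZE;

	return new_sp + save_area_size <= red_zone_low;
}
```

In this model the save area ends exactly where the red zone begins, so nothing the interrupt handler stores can overlap the scratch data of the interrupted code.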