Re: [PATCH v1 1/3] LoongArch: BPF: Fix tail call prologue alignment

From: Tiezhu Yang

Date: Sat May 23 2026 - 05:19:49 EST


On 2026/5/21 3:01 PM, Tiezhu Yang wrote:
Currently, the LoongArch BPF JIT assumes a fixed number of instructions to
skip when performing a tail call, but actually it does not account for the
difference between the main programs and the subprograms. The subprograms
do not initialize the Tail Call Count (TCC), leading to an incorrect jump
offset that skips functional instructions (like stack adjustment).

To fix this issue, add a NOP instruction in the subprogram prologue where
the TCC initialization would normally be. This ensures that both the main
programs and the subprograms have consistent prologue layouts, allowing
tail calls to always land on the correct instruction regardless of the
target program type.

Fixes: c0fcc955ff82 ("LoongArch: BPF: Fix the tailcall hierarchy")
Signed-off-by: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
---
arch/loongarch/net/bpf_jit.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 24913dc7f4e8..aefe5fa1c584 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -149,12 +149,15 @@ static void build_prologue(struct jit_ctx *ctx)
 		emit_insn(ctx, nop);
 	/*
-	 * First instruction initializes the tail call count (TCC)
-	 * register to zero. On tail call we skip this instruction,
-	 * and the TCC is passed in REG_TCC from the caller.
+	 * Initialize or align the TCC register slot.
+	 * For main programs, TCC is zeroed. For subprograms, a nop is emitted
+	 * to keep the prologue size consistent, ensuring tail calls skip the
+	 * correct number of instructions.
 	 */
 	if (is_main_prog)
 		emit_insn(ctx, addid, REG_TCC, LOONGARCH_GPR_ZERO, 0);
+	else
+		emit_insn(ctx, nop);
 	emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_adjust);

IMO, the code itself looks good, but the code comment and the commit
message are not accurate. I will send v2 next week, something like this:

----->8-----
LoongArch: BPF: Keep prologue size consistent

The BPF JIT assumes a consistent prologue size to preserve text symmetry
and layout alignment across different function frames. But the current
JIT implementation emits a shortened prologue for subprograms, because
it skips the Tail Call Count (TCC) initialization required only by main
programs, resulting in a layout difference.

A tail call can never target a subprogram because the verifier strictly
prevents subprograms from being inserted into a BPF_MAP_TYPE_PROG_ARRAY.
Nevertheless, keeping the prologue size consistent is still vital for the
JIT infrastructure. Without this symmetry, dynamic text patching or stack
unwinding can miscalculate instruction boundaries, leading to a kernel
crash.

Fix this by emitting a NOP instruction in the subprogram prologue where
the TCC initialization would normally be. This ensures that both main
programs and subprograms keep the prologue size consistent, preserving
text symmetry and layout alignment across all function frames.

Signed-off-by: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
---
arch/loongarch/net/bpf_jit.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 24913dc7f4e8..976383a859e1 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -149,12 +149,15 @@ static void build_prologue(struct jit_ctx *ctx)
 		emit_insn(ctx, nop);

 	/*
-	 * First instruction initializes the tail call count (TCC)
-	 * register to zero. On tail call we skip this instruction,
-	 * and the TCC is passed in REG_TCC from the caller.
+	 * Initialize or align the TCC register slot.
+	 * For main programs, TCC is zeroed. For subprograms, a NOP is emitted
+	 * to keep the prologue size consistent, preserving text symmetry and
+	 * layout alignment across different function frames.
 	 */
 	if (is_main_prog)
 		emit_insn(ctx, addid, REG_TCC, LOONGARCH_GPR_ZERO, 0);
+	else
+		emit_insn(ctx, nop);

 	emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_adjust);

--
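
To illustrate the size argument, here is a minimal sketch as a standalone
user-space C model, not kernel code. It only counts the prologue
instructions touched by the hunk above. The names model_ctx, model_emit,
model_prologue_insns and model_check_consistent are hypothetical, and
MODEL_RESERVED_NOPS is an assumed placeholder for the number of nops
emitted before the TCC slot; the rest of the real prologue is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumed placeholder for the number of nops reserved at the start of
 * the prologue; the real value is defined by the kernel, not here.
 */
#define MODEL_RESERVED_NOPS 5

/* Hypothetical stand-in for struct jit_ctx: tracks only the insn index. */
struct model_ctx {
	size_t idx;
};

/* Hypothetical stand-in for emit_insn(): one call emits one instruction. */
static void model_emit(struct model_ctx *ctx)
{
	ctx->idx++;
}

/*
 * Count the instructions emitted by the modelled part of the prologue.
 * pad_subprog selects the patched behaviour (a nop in the TCC slot).
 */
static size_t model_prologue_insns(bool is_main_prog, bool pad_subprog)
{
	struct model_ctx ctx = { .idx = 0 };
	int i;

	for (i = 0; i < MODEL_RESERVED_NOPS; i++)
		model_emit(&ctx);	/* reserved nops */

	if (is_main_prog)
		model_emit(&ctx);	/* addi.d REG_TCC, zero, 0 */
	else if (pad_subprog)
		model_emit(&ctx);	/* nop filling the TCC slot */

	model_emit(&ctx);		/* addi.d sp, sp, -stack_adjust */

	return ctx.idx;
}

/* With padding, main programs and subprograms have the same length. */
static void model_check_consistent(void)
{
	assert(model_prologue_insns(true, true) ==
	       model_prologue_insns(false, true));
}
```

Without the padding (pad_subprog == false), the modelled subprogram
prologue is one instruction shorter than that of a main program, which
is the layout difference described in the commit message.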

Thanks,
Tiezhu