[PATCH v1 1/4] LoongArch: BPF: Optimize redundant TCC loads in epilogue
From: Tiezhu Yang
Date: Mon Jun 29 2026 - 22:28:26 EST
The epilogue currently restores the tail call counter (TCC) context
with two back-to-back loads into REG_TCC. It first decrements
load_offset by two slots to load 'tcc_ptr', then bumps it back up by
one slot to load the original 'tcc' value. Both loads target REG_TCC,
so the first one is dead: its result is overwritten immediately.

Drop the dead load and decrement load_offset by only one slot. The
offset then points directly at the higher stack slot, which holds the
entry TCC counter (or caller state), so REG_TCC is restored with a
single load.

This removes one redundant instruction from the epilogue hot path,
makes the code easier to read, and still hands the correct TCC
context back in REG_TCC on normal return.

Signed-off-by: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
---
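Note for reviewers (not part of the commit): the offset arithmetic in
the hunk below can be checked with a small stand-alone model. This is
only a sketch, not kernel code. The helper names old_final_offset()
and new_final_offset() are hypothetical, and the model assumes the
only observable effect of the sequence is the offset used by the last
load into REG_TCC.

```c
#include <stddef.h>

/*
 * Hypothetical model of the old epilogue sequence: returns the stack
 * offset used by the last load into REG_TCC.
 */
static long old_final_offset(long load_offset)
{
	/* First load (tcc_ptr slot); its result is overwritten below. */
	load_offset -= 2 * (long)sizeof(long);

	/* Second load (tcc slot); this value is what REG_TCC keeps. */
	load_offset += (long)sizeof(long);

	return load_offset;
}

/*
 * Hypothetical model of the new sequence: a single load, one slot
 * below the incoming offset.
 */
static long new_final_offset(long load_offset)
{
	load_offset -= (long)sizeof(long);

	return load_offset;
}
```

For any starting offset, both helpers return the same value, so the
single remaining load reads the same slot as the second load in the
old sequence.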
arch/loongarch/net/bpf_jit.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index ad7e28375aa9..719dc4ab7a98 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -246,14 +246,8 @@ static void __build_epilogue(struct jit_ctx *ctx, bool is_tail_call)
emit_insn(ctx, ldd, REG_ARENA, LOONGARCH_GPR_SP, load_offset);
}
- /*
- * When push into the stack, follow the order of tcc then tcc_ptr.
- * When pop from the stack, first pop tcc_ptr then followed by tcc.
- */
- load_offset -= 2 * sizeof(long);
- emit_insn(ctx, ldd, REG_TCC, LOONGARCH_GPR_SP, load_offset);
-
- load_offset += sizeof(long);
+ /* Only restore the TCC state into REG_TCC from the higher slot */
+ load_offset -= sizeof(long);
emit_insn(ctx, ldd, REG_TCC, LOONGARCH_GPR_SP, load_offset);
emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_adjust);
--
2.42.0