Re: [PATCH] LoongArch: Enhance robust of kprobe

From: Tiezhu Yang
Date: Thu Apr 10 2025 - 22:49:02 EST


On 04/09/2025 10:17 AM, Jinyang He wrote:
On 2025-04-08 17:27, Tiezhu Yang wrote:

Currently, interrupts need to be disabled before single-step mode is set,
it requires that the CSR_PRMD_PIE must be cleared in save_local_irqflag()
which is called by setup_singlestep(), this is reasonable.

But in the first kprobe breakpoint exception, if the irq is enabled at
the
beginning of do_bp(), it will not be disabled at the end of do_bp()
due to
the CSR_PRMD_PIE has been cleared in save_local_irqflag(). For this case,
it may corrupt exception context when restoring exception after
do_bp() in
handle_bp(), this is not reasonable.

Based on the above analysis, in order to make sure the irq is disabled at
the end of do_bp() for the first kprobe breakpoint exception, it is
proper
to disable irq first before clearing CSR_PRMD_PIE in
save_local_irqflag().

Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support")
Co-developed-by: Tianyang Zhang <zhangtianyang@xxxxxxxxxxx>
Signed-off-by: Tianyang Zhang <zhangtianyang@xxxxxxxxxxx>
Signed-off-by: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
---
arch/loongarch/kernel/kprobes.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/loongarch/kernel/kprobes.c
b/arch/loongarch/kernel/kprobes.c
index 8ba391cfabb0..6eab97636e6b 100644
--- a/arch/loongarch/kernel/kprobes.c
+++ b/arch/loongarch/kernel/kprobes.c
@@ -113,6 +113,7 @@ NOKPROBE_SYMBOL(set_current_kprobe);
static void save_local_irqflag(struct kprobe_ctlblk *kcb,
struct pt_regs *regs)
{
+ local_irq_disable();
kcb->saved_status = regs->csr_prmd;
regs->csr_prmd &= ~CSR_PRMD_PIE;
}

Hi, Tiezhu,

I think the carsh is caused by "irq-triggered re-re-enter" clear
the previous_kprobe status. An example things like,

...
static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb, int reenter)
{
union loongarch_instruction insn;

if (reenter) {
save_previous_kprobe(kcb);
=================== <- irq and trigger re-re-enter in its handler
set_current_kprobe(p);
kcb->kprobe_status = KPROBE_REENTER;
} else {
kcb->kprobe_status = KPROBE_HIT_SS;
}
...

We should assure the previous_kprobe status not be changed after re-enter.
So this `local_irq_disable` should be set in reenter block begin.
And for !reenter block, `local_irq_disable` may be not needed.

If you mean to do the following change:

diff --git a/arch/loongarch/kernel/kprobes.c b/arch/loongarch/kernel/kprobes.c
index 8ba391cfabb0..1ad67a3c7b70 100644
--- a/arch/loongarch/kernel/kprobes.c
+++ b/arch/loongarch/kernel/kprobes.c
@@ -158,6 +158,7 @@ static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
union loongarch_instruction insn;

if (reenter) {
+ local_irq_disable();
save_previous_kprobe(kcb);
set_current_kprobe(p);
kcb->kprobe_status = KPROBE_REENTER;

then it can not fix the case Tianyang and I have met.

With the above change, the issue of kernel hang can
still be reproduced after a few hours.

With the original change of this patch, the issue of
kernel hang can not be reproduced for several days,
it works well.

Maybe what you said is another case, but anyway, the
original change of this patch is safe for any case
because its coverage is more extensive, so just keep
it as is.

Thanks,
Tiezhu