Re: [PATCH] RISC-V: prevent sbi_send_cpumask_ipi race with ftrace

From: Dimitri John Ledkov
Date: Fri Aug 13 2021 - 13:13:52 EST


Hi,

On Thu, Aug 12, 2021 at 4:53 PM Atish Patra <atishp@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 12, 2021 at 5:36 AM Dimitri John Ledkov
> <dimitri.ledkov@xxxxxxxxxxxxx> wrote:
> >
> > From: Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxxxxx>
> >
> > ftrace will patch instructions in sbi_send_cpumask_ipi, which is going to
> > be used by flush_icache_range, leading to potential races and crashes like
> > this:
> >
> > [ 0.000000] ftrace: allocating 38893 entries in 152 pages
> > [ 0.000000] Oops - illegal instruction [#1]
> > [ 0.000000] Modules linked in:
> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.11.0-1014-generic #14-Ubuntu
> > [ 0.000000] epc: ffffffe00000920e ra : ffffffe000009384 sp : ffffffe001803d30
> > [ 0.000000] gp : ffffffe001a14240 tp : ffffffe00180f440 t0 : ffffffe07fe38000
> > [ 0.000000] t1 : ffffffe0019cd338 t2 : 0000000000000000 s0 : ffffffe001803d70
> > [ 0.000000] s1 : 0000000000000000 a0 : ffffffe0000095aa a1 : 0000000000000001
> > [ 0.000000] a2 : 0000000000000002 a3 : 0000000000000000 a4 : 0000000000000000
> > [ 0.000000] a5 : 0000000000000000 a6 : 0000000000000004 a7 : 0000000052464e43
> > [ 0.000000] s2 : 0000000000000002 s3 : 0000000000000001 s4 : 0000000000000000
> > [ 0.000000] s5 : 0000000000000000 s6 : 0000000000000000 s7 : 0000000000000000
> > [ 0.000000] s8 : ffffffe001a170c0 s9 : 0000000000000001 s10: 0000000000000001
> > [ 0.000000] s11: 00000000fffcc5d0 t3 : 0000000000000068 t4 : 000000000000000b
> > [ 0.000000] t5 : ffffffe0019cd3e0 t6 : ffffffe001803cd8
> > [ 0.000000] status: 0000000200000100 badaddr: 000000000513f187 cause: 0000000000000002
> > [ 0.000000] ---[ end trace f67eb9af4d8d492b ]---
> > [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> > [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> >
> > Where ffffffe00000920e lies in the middle of sbi_send_cpumask_ipi.
> >
> > Reproduced on Unmatched board using Ubuntu kernels. See
> > https://people.canonical.com/~xnox/lp1934548/ for sample images,
> > kernels, debug symbols.
> >
> > BugLink: https://bugs.launchpad.net/bugs/1934548
> > Reported-by: Pierce Andjelkovic <pierceandjelkovic@xxxxxxxxx>
> > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxxxxx>
> > Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@xxxxxxxxxxxxx>
> > cc: Paul Walmsley <paul.walmsley@xxxxxxxxxx>
> > cc: linux-riscv@xxxxxxxxxxxxxxxxxxx
> > cc: stable@xxxxxxxxxxxxxxx
> > ---
> > arch/riscv/kernel/sbi.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> > index 7402a417f38e..158199865c68 100644
> > --- a/arch/riscv/kernel/sbi.c
> > +++ b/arch/riscv/kernel/sbi.c
> > @@ -562,7 +562,7 @@ long sbi_get_mimpid(void)
> > return __sbi_base_ecall(SBI_EXT_BASE_GET_MIMPID);
> > }
> >
> > -static void sbi_send_cpumask_ipi(const struct cpumask *target)
> > +static void notrace sbi_send_cpumask_ipi(const struct cpumask *target)
> > {
> > struct cpumask hartid_mask;
> >
>
> flush_icache_range doesn't invoke sbi_send_cpumask_ipi.
> flush_icache_range->flush_icache_all->sbi_remote_fence_i->__sbi_rfence->sbi_ecall
>
> Moreover, sbi.c should already be excluded from the ftrace path, since it
> is built without the ftrace flags after the patch [1]:
>
> CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE)
>
> This solution was proposed as a result of earlier discussion [2] last year.
>
> [1] https://patchwork.kernel.org/project/linux-riscv/patch/1608220905-1962-5-git-send-email-guoren@xxxxxxxxxx/
> [2] https://lkml.org/lkml/2020/11/3/735
>
> The proposed fix is probably hiding the root cause somehow.
>
> Do you have the patch [1] in your kernel?
>

We do not. I have applied and tested it, and it does resolve the boot
issue. It also makes a lot more sense than my patch. Neither it nor its
prerequisite patch carried a CC: stable tag, so I will submit both to
stable, as they are required for bootable kernels. Thanks a lot for the
pointers.

So this patch that I sent is NACKed now.
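
For anyone comparing the two approaches later: my patch marked a single
function notrace, while [1] drops the ftrace flags for the whole object
via CFLAGS_REMOVE_sbi.o. Below is a rough userspace sketch of the
per-function variant only. It assumes notrace expands to the compiler's
no_instrument_function attribute (the kernel may pick a different
attribute depending on config), and the function names are hypothetical
stand-ins, not the real code in sbi.c.

```c
#include <assert.h>

/* Assumption: stands in for the kernel's notrace annotation. */
#define notrace __attribute__((no_instrument_function))

/*
 * Hypothetical stand-in for sbi_send_cpumask_ipi(). Because of notrace,
 * the compiler emits no instrumentation hook here, so there is nothing
 * for a tracer to patch in this function.
 */
unsigned long notrace send_ipi_sketch(unsigned long hart_mask)
{
	/* Pretend only harts 0-3 exist and mask off the rest. */
	return hart_mask & 0xfUL;
}

/*
 * Hypothetical caller without the annotation. If the file were built
 * with instrumentation enabled, this function would still get a hook.
 * That is the gap a per-file flag removal closes.
 */
unsigned long flush_sketch(unsigned long hart_mask)
{
	return send_ipi_sketch(hart_mask);
}

/* Small self-check so the sketch can be exercised from a test harness. */
void sketch_self_check(void)
{
	assert(send_ipi_sketch(0x13UL) == 0x3UL);
	assert(flush_sketch(0xffUL) == 0xfUL);
}
```

The per-function annotation only covers the functions you remember to
tag, which is why removing the flags for the whole file is the more
robust fix here.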
--
Regards,

Dimitri.