Re: [PATCH] LoongArch: Fix calling smp_processor_id() in preemptible code
From: Huacai Chen
Date: Thu Feb 26 2026 - 04:10:03 EST
Hi, Ruoyao,
On Wed, Feb 25, 2026 at 10:15 PM Xi Ruoyao <xry111@xxxxxxxxxxx> wrote:
>
> Fix the warning recently exposed by commit 4ab17e762b34 ("LoongArch:
> BPF: Use BPF prog pack allocator"):
>
> BUG: using smp_processor_id() in preemptible [00000000] code: systemd/1
> caller is larch_insn_text_copy+0x40/0xf0
>
> Simply changing it to raw_smp_processor_id() is not enough: if
> preemption and a CPU hotplug event happen after raw_smp_processor_id()
> but before stop_machine(), the CPU on which raw_smp_processor_id() ran
> may already be offline when stop_machine() executes, so no CPU will run
> copy_to_kernel_nofault() in text_copy_cb(). Thus guard the
> larch_insn_text_copy() calls with cpus_read_lock() and change
> stop_machine() to stop_machine_cpuslocked() to prevent this.
>
> Fixes: 9fbd18cf4c69 ("LoongArch: BPF: Add dynamic code modification support")
> Signed-off-by: Xi Ruoyao <xry111@xxxxxxxxxxx>
> ---
>
> P.S. I've considered moving the locks inside larch_insn_text_copy(),
> but doing so does not seem easy. In bpf_arch_text_poke() the memcpy()
> call obviously must be guarded by text_mutex, so we have to leave the
> acquisition of text_mutex outside larch_insn_text_copy(). But
> throughout the kernel, mutexes are always acquired after
> cpus_read_lock(), so we cannot put cpus_read_lock() inside
> larch_insn_text_copy() while leaving the text_mutex acquisition outside
> (or we risk a deadlock due to an inconsistent lock acquisition order).
> So I'd just submit this as-is to fix the bug and leave the possible
> refactor for others.
>
> arch/loongarch/kernel/inst.c | 4 ++--
> arch/loongarch/net/bpf_jit.c | 6 ++++++
> 2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index bf037f0c6b26..db236e2b34a7 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -263,14 +263,14 @@ int larch_insn_text_copy(void *dst, void *src, size_t len)
> .dst = dst,
> .src = src,
> .len = len,
> - .cpu = smp_processor_id(),
> + .cpu = raw_smp_processor_id(),
I think we should keep the old function here, because it can help
others catch new bugs if the code is refactored in the future.
Huacai
> };
>
> start = round_down((size_t)dst, PAGE_SIZE);
> end = round_up((size_t)dst + len, PAGE_SIZE);
>
> set_memory_rw(start, (end - start) / PAGE_SIZE);
> - ret = stop_machine(text_copy_cb, &copy, cpu_online_mask);
> + ret = stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> set_memory_rox(start, (end - start) / PAGE_SIZE);
>
> return ret;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 3bd89f55960d..e8e0ad34928c 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -1379,9 +1379,11 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> {
> int ret;
>
> + cpus_read_lock();
> mutex_lock(&text_mutex);
> ret = larch_insn_text_copy(dst, src, len);
> mutex_unlock(&text_mutex);
> + cpus_read_unlock();
>
> return ret ? ERR_PTR(-EINVAL) : dst;
> }
> @@ -1429,10 +1431,12 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
> if (ret)
> return ret;
>
> + cpus_read_lock();
> mutex_lock(&text_mutex);
> if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> mutex_unlock(&text_mutex);
> + cpus_read_unlock();
>
> return ret;
> }
> @@ -1450,10 +1454,12 @@ int bpf_arch_text_invalidate(void *dst, size_t len)
> for (i = 0; i < (len / sizeof(u32)); i++)
> inst[i] = INSN_BREAK;
>
> + cpus_read_lock();
> mutex_lock(&text_mutex);
> if (larch_insn_text_copy(dst, inst, len))
> ret = -EINVAL;
> mutex_unlock(&text_mutex);
> + cpus_read_unlock();
>
> kvfree(inst);
>
> --
> 2.53.0
>