Re: [PATCH v12 05/15] arm64: rqspinlock: Remove private copy of smp_cond_load_acquire_timewait()
From: bot+bpf-ci
Date: Mon Jun 08 2026 - 04:57:59 EST
> diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
> index 9ea0a74e5892..a385603436e9 100644
> --- a/arch/arm64/include/asm/rqspinlock.h
> +++ b/arch/arm64/include/asm/rqspinlock.h
> @@ -3,91 +3,6 @@
> #define _ASM_RQSPINLOCK_H
>
> #include <asm/barrier.h>
> -
> -/*
> - * Hardcode res_smp_cond_load_acquire implementations for arm64 to a custom
> - * version based on [0]. In rqspinlock code, our conditional expression involves
> - * checking the value _and_ additionally a timeout. However, on arm64, the
> - * WFE-based implementation may never spin again if no stores occur to the
> - * locked byte in the lock word. As such, we may be stuck forever if
> - * event-stream based unblocking is not available on the platform for WFE spin
> - * loops (arch_timer_evtstrm_available).
> - *
[ ... ]
> -#endif
> -
> -#define res_smp_cond_load_acquire(v, c) smp_cond_load_acquire_timewait(v, c, 0, 1)
> -
> #include <asm-generic/rqspinlock.h>
With res_smp_cond_load_acquire removed here, does arm64 now fall back to
the generic definition in kernel/bpf/rqspinlock.c?
    #ifndef res_smp_cond_load_acquire
    #define res_smp_cond_load_acquire(v, c) smp_cond_load_acquire(v, c)
    #endif
On arm64, smp_cond_load_acquire() uses __cmpwait_relaxed(), which issues a
WFE that wakes only on a store to the monitored cache line or on an
event-stream tick.
The consumer is resilient_queued_spin_lock_slowpath():
kernel/bpf/rqspinlock.c:
    res_smp_cond_load_acquire(&lock->locked,
                              !VAL || RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_MASK));
The RES_CHECK_TIMEOUT() escape is part of cond_expr, so it is re-evaluated
only after the WFE inside smp_cond_load_acquire() wakes up.
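A rough userspace model of that evaluation order, for illustration only.
Everything here (struct wait_model, model_wfe(), model_cond_load()) is
hypothetical and is not kernel code. The evtstrm field merely stands in for
arch_timer_evtstrm_available(), and owner_alive stands in for an owner that
eventually stores to lock->locked.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum model_outcome { MODEL_ACQUIRED, MODEL_TIMED_OUT, MODEL_STUCK };

struct wait_model {
    int locked;        /* stands in for lock->locked */
    long now;          /* simulated clock */
    long deadline;     /* simulated timeout deadline */
    bool owner_alive;  /* owner will store to the lock word */
    bool evtstrm;      /* models arch_timer_evtstrm_available() */
};

/* Models the WFE: returns true on wake-up, false if nothing ever wakes it. */
static bool model_wfe(struct wait_model *m)
{
    if (m->owner_alive) {
        m->locked = 0;     /* the owner's store wakes the waiter */
        return true;
    }
    if (m->evtstrm) {
        m->now++;          /* a periodic event-stream tick wakes the waiter */
        return true;
    }
    return false;          /* no store and no tick: the waiter never wakes */
}

/* Models a cond-load loop whose cond_expr includes the timeout check. */
static enum model_outcome model_cond_load(struct wait_model *m)
{
    for (;;) {
        /* cond_expr: !VAL || timeout check, evaluated only here */
        if (!m->locked)
            return MODEL_ACQUIRED;
        if (m->now >= m->deadline)
            return MODEL_TIMED_OUT;
        /* Without a wake-up, cond_expr is never evaluated again. */
        if (!model_wfe(m))
            return MODEL_STUCK;
    }
}
```

Under these assumptions, a live owner yields MODEL_ACQUIRED, a dead owner with
the event stream yields MODEL_TIMED_OUT, and a dead owner without the event
stream yields MODEL_STUCK.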
On platforms where arch_timer_evtstrm_available() is false, if the current
lock owner is stuck or dead and issues no further store to lock->locked,
can the WFE ever wake, and can the RES_CHECK_TIMEOUT timeout ever fire?
This looks like the exact case the removed comment described:
> - * checking the value _and_ additionally a timeout. However, on arm64, the
> - * WFE-based implementation may never spin again if no stores occur to the
> - * locked byte in the lock word. As such, we may be stuck forever if
> - * event-stream based unblocking is not available on the platform for WFE spin
> - * loops (arch_timer_evtstrm_available).
The removed private copy avoided this by falling back to a cpu_relax()-based
spinwait that always re-checks the timeout when the event stream is
unavailable.
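As a sketch of why a relax-based spinwait does not have this problem, the
following userspace model re-checks the timeout on every iteration. The names
(struct spin_model, model_cpu_relax(), model_spinwait()) are hypothetical, and
the simulated clock is an assumption made for illustration; this is not the
removed kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum spin_outcome { SPIN_ACQUIRED, SPIN_TIMED_OUT };

struct spin_model {
    int locked;      /* stands in for lock->locked */
    long now;        /* simulated clock */
    long deadline;   /* simulated timeout deadline */
    long unlock_at;  /* simulated time of the owner's unlock; < 0 means never */
};

/* Models cpu_relax(): time passes, but the waiter never blocks. */
static void model_cpu_relax(struct spin_model *m)
{
    m->now++;
    if (m->unlock_at >= 0 && m->now >= m->unlock_at)
        m->locked = 0;
}

/* Spinwait: the value and the timeout are both checked on every pass. */
static enum spin_outcome model_spinwait(struct spin_model *m)
{
    for (;;) {
        if (!m->locked)
            return SPIN_ACQUIRED;
        if (m->now >= m->deadline)
            return SPIN_TIMED_OUT;
        model_cpu_relax(m);
    }
}
```

In this model, an owner that never unlocks still leads to SPIN_TIMED_OUT once
the simulated clock reaches the deadline, because no wake-up event is needed
to get back to the timeout check.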
A forward search of the series shows this is resolved by the later commit
7abb03c21db2 ("bpf/rqspinlock: Use smp_cond_load_acquire_timeout()"), which
switches rqspinlock to smp_cond_load_acquire_timeout(); on arm64 that uses
a wait implementation that handles both the timeout and the case where the
event stream is absent.
Should the switch happen in this same commit, or is it acceptable that a
kernel built or bisected at this commit can block indefinitely?
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27125050324