Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()
From: Catalin Marinas
Date: Mon Sep 01 2025 - 07:29:03 EST
On Fri, Aug 29, 2025 at 01:07:35AM -0700, Ankur Arora wrote:
> diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
> index a385603436e9..ce8feadeb9a9 100644
> --- a/arch/arm64/include/asm/rqspinlock.h
> +++ b/arch/arm64/include/asm/rqspinlock.h
> @@ -3,6 +3,9 @@
> #define _ASM_RQSPINLOCK_H
>
> #include <asm/barrier.h>
> +
> +#define res_smp_cond_load_acquire_waiting() arch_timer_evtstrm_available()
More on this below; I don't think we should define it.
> diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
> index 5ab354d55d82..8de1395422e8 100644
> --- a/kernel/bpf/rqspinlock.c
> +++ b/kernel/bpf/rqspinlock.c
> @@ -82,6 +82,7 @@ struct rqspinlock_timeout {
> u64 duration;
> u64 cur;
> u16 spin;
> + u8 wait;
> };
>
> #define RES_TIMEOUT_VAL 2
> @@ -241,26 +242,20 @@ static noinline int check_timeout(rqspinlock_t *lock, u32 mask,
> }
>
> /*
> - * Do not amortize with spins when res_smp_cond_load_acquire is defined,
> - * as the macro does internal amortization for us.
> + * Only amortize with spins when we don't have a waiting implementation.
> */
> -#ifndef res_smp_cond_load_acquire
> #define RES_CHECK_TIMEOUT(ts, ret, mask) \
> ({ \
> - if (!(ts).spin++) \
> + if ((ts).wait || !(ts).spin++) \
> (ret) = check_timeout((lock), (mask), &(ts)); \
> (ret); \
> })
> -#else
> -#define RES_CHECK_TIMEOUT(ts, ret, mask) \
> - ({ (ret) = check_timeout((lock), (mask), &(ts)); })
> -#endif
IIUC, in the current res_smp_cond_load_acquire() usage, RES_CHECK_TIMEOUT()
doesn't amortise the spins, as the comment suggests; it amortises the
calls to check_timeout(). This is fine, as it matches the behaviour of
smp_cond_load_relaxed_timewait(), which you introduced in the first patch.
The only difference is the number of spins between checks: 200 (matching
poll_idle) vs 64K above. Does 200 work for the above?
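To make the 200 vs 64K comparison concrete, here is a minimal userspace sketch (not kernel code). mock_check_timeout() and both helper names are hypothetical stand-ins. The first helper copies the `!(ts).spin++` pattern with a 16-bit counter. The second assumes an explicit spin limit, which is only a guess at the shape of a 200-spin budget.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for check_timeout(): it only counts invocations. */
static unsigned long check_calls;

static int mock_check_timeout(void)
{
	check_calls++;
	return 0;
}

/*
 * Pattern used by RES_CHECK_TIMEOUT(): a 16-bit counter wraps every
 * 65536 iterations, so the check runs once per 64K loop iterations.
 */
static unsigned long calls_with_u16_wrap(unsigned long iterations)
{
	uint16_t spin = 0;

	check_calls = 0;
	for (unsigned long i = 0; i < iterations; i++) {
		if (!spin++)
			mock_check_timeout();
	}
	return check_calls;
}

/*
 * Assumed shape of an explicit spin budget: the check runs on the first
 * iteration and then once every 'limit' iterations.
 */
static unsigned long calls_with_limit(unsigned long iterations,
				      unsigned int limit)
{
	unsigned int spin = 0;

	assert(limit > 0);
	check_calls = 0;
	for (unsigned long i = 0; i < iterations; i++) {
		if (spin++ == 0)
			mock_check_timeout();
		if (spin == limit)
			spin = 0;
	}
	return check_calls;
}
```

In both helpers the loop itself spins on every iteration; only the frequency of the (comparatively expensive) timeout check changes.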
> /*
> * Initialize the 'spin' member.
> * Set spin member to 0 to trigger AA/ABBA checks immediately.
> */
> -#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; })
> +#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; (ts).wait = res_smp_cond_load_acquire_waiting(); })
First of all, I don't really like res_smp_cond_load_acquire_waiting(),
that's an implementation detail of smp_cond_load_*_timewait() that
shouldn't leak outside. But more importantly, RES_CHECK_TIMEOUT() is
also used outside the smp_cond_load_acquire_timewait() condition. The
(ts).wait check only makes sense when used together with the WFE
waiting.
I would leave RES_CHECK_TIMEOUT() as is for the stand-alone cases and
just use check_timeout() in the smp_cond_load_acquire_timewait()
scenarios. I would also drop the res_smp_cond_load_acquire() macro since
smp_cond_load_acquire_timewait() is now defined generically and can be
used directly.
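A rough userspace sketch of that split, under stated assumptions: everything prefixed mock_ is hypothetical, and the real smp_cond_load_acquire_timewait() is a macro whose exact arguments are not shown in this patch. The idea is that the timed-wait primitive owns the spin amortisation, so the caller passes the plain timeout check and needs neither its own spin counter nor a 'wait' flag.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical timeout state driven by a fake clock. */
struct mock_timeout {
	uint64_t deadline;
	uint64_t now;
};

/*
 * Hypothetical stand-in for check_timeout(): advances the fake clock and
 * reports -ETIMEDOUT once the deadline is reached.
 */
static int mock_check_timeout(struct mock_timeout *ts)
{
	ts->now++;
	return ts->now >= ts->deadline ? -ETIMEDOUT : 0;
}

/* Assumed internal spin budget between time checks. */
#define MOCK_SPIN_LIMIT 200

/*
 * Mock of a timed conditional load: spin until *ptr reads as zero or the
 * time check reports expiry. The amortisation lives here, not in the
 * caller.
 */
static uint32_t mock_cond_load_acquire_timewait(_Atomic uint32_t *ptr,
						struct mock_timeout *ts,
						int *ret)
{
	unsigned int spin = 0;
	uint32_t val;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_acquire);
		if (!val)
			break;
		if (++spin < MOCK_SPIN_LIMIT)
			continue;
		spin = 0;
		*ret = mock_check_timeout(ts);
		if (*ret)
			break;
	}
	return val;
}

/* Caller side: no spin counter and no 'wait' flag, just the plain check. */
static int mock_wait_for_unlock(_Atomic uint32_t *locked, uint64_t deadline)
{
	struct mock_timeout ts = { .deadline = deadline, .now = 0 };
	int ret = 0;

	mock_cond_load_acquire_timewait(locked, &ts, &ret);
	return ret;
}
```

Stand-alone call sites that are not inside a timed wait would keep their own amortising counter, as RES_CHECK_TIMEOUT() does today.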
--
Catalin