Re: [patch 62/63] locking/rtmutex: Add adaptive spinwait mechanism
From: Peter Zijlstra
Date: Wed Aug 04 2021 - 08:32:17 EST
On Fri, Jul 30, 2021 at 03:51:09PM +0200, Thomas Gleixner wrote:
> From: Steven Rostedt <rostedt@xxxxxxxxxxx>
>
> Going to sleep when a spinlock or rwlock is contended can be quite
> inefficient when the contention time is short and the lock owner is running
> on a different CPU. The MCS mechanism is not applicable to rtmutex based
> locks, so provide a simple adaptive spinwait mechanism for the RT specific
> spin/rwlock implementations.
A better Changelog would explain *why* OSQ does not apply. I'm thinking
this is because the (spin) waiters can be of different priorities and we
need to ensure the highest prio waiter gets to win?
AFAICT that isn't true even without OSQ, you just get a thundering herd
and the higher prio waiter has a better chance of winning the race but
all bets are off either way around.
> [ tglx: Provide a contemporary changelog ]
It might be best to squash this and the next patch, this back and forth
doesn't make much sense at this point.
> +#ifdef CONFIG_SMP
Existing convention would make that:
#ifdef CONFIG_RTMUTEX_SPIN_ON_OWNER
But I suppose that's indeed not required if we don't use OSQ.
> +/*
> + * Note that owner is a speculative pointer and dereferencing relies
> + * on rcu_read_lock() and the check against the lock owner.
> + */
> +static bool rtlock_adaptive_spinwait(struct rt_mutex_base *lock,
> + struct task_struct *owner)
similarly, this would be:
rt_mutex_spin_on_owner()
> +{
> + bool res = true;
> +
> + rcu_read_lock();
> + for (;;) {
> + /* Owner changed. Trylock again */
> + if (owner != rt_mutex_owner(lock))
> + break;
> + /*
> + * Ensure that owner->on_cpu is dereferenced _after_
> + * checking the above to be valid.
> + */
> + barrier();
> + if (!owner->on_cpu) {
Esp. when this will be on rtmutex unconditionally, you want to mirror
the full set of conditions we also have on mutex_spin_on_owner():
|| need_resched() || vcpu_is_preempted(task_cpu(owner))
> + res = false;
> + break;
> + }
> + cpu_relax();
> + }
> + rcu_read_unlock();
> + return res;
> +}
Additionally, we could consider adding something that would compare the
current prio to the top_waiter prio and terminate the loop if we're
found to be of lower prio, but lifetime issues are probably going to
make that 'interesting'.