Re: [RFC PATCH 1/2] locking: add mutex_lock_nospin()

From: Peter Zijlstra

Date: Wed Mar 04 2026 - 07:42:11 EST


On Wed, Mar 04, 2026 at 07:52:06PM +0800, Yafang Shao wrote:
> On Wed, Mar 4, 2026 at 6:11 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Mar 04, 2026 at 05:37:31PM +0800, Yafang Shao wrote:
> > > On Wed, Mar 4, 2026 at 5:03 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Mar 04, 2026 at 03:46:49PM +0800, Yafang Shao wrote:
> > > > > Introduce mutex_lock_nospin(), a helper that disables optimistic spinning
> > > > > on the owner for specific heavy locks. This prevents long spinning times
> > > > > that can lead to latency spikes for other tasks on the same runqueue.
> > > >
> > > > This makes no sense; spinning stops on need_resched().
> > >
> > > Hello Peter,
> > >
> > > The condition to stop spinning on need_resched() relies on the mutex
> > > owner remaining unchanged. However, when multiple tasks contend for
> > > the same lock, the owner can change frequently. This creates a
> > > potential TOCTOU (Time of Check to Time of Use) issue.
> > >
> > > mutex_optimistic_spin()
> > >     owner = __mutex_trylock_or_owner(lock);
> > >     mutex_spin_on_owner()
> > >         // the __mutex_owner(lock) might get a new owner.
> > >         while (__mutex_owner(lock) == owner)
> > >
> >
> > How do these new owners become the owner? Are they succeeding the
> > __mutex_trylock() that sits before mutex_optimistic_spin() and
> > effectively starving the spinner?
> >
> > Something like the below would make a difference if that were so.
>
> The following change made no difference; concurrent runs still result
> in prolonged system time.
>
> real 0m5.265s user 0m0.000s sys 0m4.921s
> real 0m5.295s user 0m0.002s sys 0m4.697s
> real 0m5.293s user 0m0.003s sys 0m4.844s
> real 0m5.303s user 0m0.001s sys 0m4.511s
> real 0m5.303s user 0m0.000s sys 0m4.694s
> real 0m5.302s user 0m0.002s sys 0m4.677s
> real 0m5.313s user 0m0.000s sys 0m4.837s
> real 0m5.327s user 0m0.000s sys 0m4.808s
> real 0m5.330s user 0m0.001s sys 0m4.893s
> real 0m5.358s user 0m0.005s sys 0m4.919s
>
> Our kernel is not built with CONFIG_PREEMPT enabled, so prolonged
> system time can lead to CPU pressure and potential latency spikes.
> Since we can reliably reproduce this unnecessary spinning, why not
> improve it to reduce the overhead?

If you cannot explain what the problem is (you haven't), there is
nothing to fix.

Also, current kernels cannot be built without PREEMPT; and if you care
about latency, running a PREEMPT=n kernel is daft. That said,
TIF_NEED_RESCHED should work irrespective of the PREEMPT settings; they
just affect when and how you end up in schedule().

Even without PREEMPT, if there is a task waiting, either the wakeup or
the tick will set TIF_NEED_RESCHED and the spinner should stop spinning.
If there is no task waiting, there is no actual latency, just burnt CPU
time, and that isn't a problem per se.

What should happen is that the first spinner gets the lock next, the
next spinner is then promoted to first spinner, and so on.

Either this chain continues, which means the lock owner is always
on-cpu, good progress is being made and there is no CPU contention; or
the spinner gets marked for preemption (as said, this does not require
PREEMPT=y), stops spinning and goes to sleep; or the owner goes to
sleep and all the spinners stop and also go to sleep.

Again, you have not said anything specific enough to figure out what
happens on your end. You said the owner changes, which means progress
is being made. What isn't clear is whether any one particular spinner
is starved (that would be a problem) or whether the latency spike you
observe is worse than it would be from running a while(1) loop, in
which case, that's just how it is.

What is not sane is marking random locks with random properties just
because of some random workload.