RE: [RFC PATCH 47/86] rcu: select PREEMPT_RCU if PREEMPT

From: David Laight
Date: Wed Dec 06 2023 - 05:08:47 EST


...
> > > One thing I would like to look into with the new work is to have holding a
> > > mutex ignore the NEED_RESCHED_LAZY (similar to what is done with spinlock
> > > converted to mutex in the RT kernel). That way you are less likely to be
> > > preempted while holding a mutex.
> >
> > I like the concept, but those with mutex_lock() of rarely-held mutexes
> > in their fastpaths might have workloads that have a contrary opinion.
>
> I don't understand your above statement. Maybe I wasn't clear with my
> statement? The above is more about PREEMPT_FULL, as it currently will
> preempt immediately. My above comment is that we can have an option for
> PREEMPT_FULL where if the scheduler decided to preempt even in a fast path,
> it would at least hold off until there's no mutex held. Who cares if it's a
> fast path when a task needs to give up the CPU for another task?
>
> What I
> worry about is scheduling out while holding a mutex which increases the
> chance of that mutex being contended upon. Which does have drastic impact
> on performance.

Indeed.
You really don't want to preempt with a mutex held if it is about to be
released. Unfortunately that really requires a CBU (Crystal Ball Unit).

But I don't think the scheduler timer ticks can be anywhere near frequent
enough to do the 'slightly delayed preemption' that is being talked about -
not without a massive overhead.
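Roughly what I have in mind is something like the sketch below - just a
sketch, the names (task_hold_state, hold_mutex, release_mutex, lazy_resched)
are invented and this isn't the real lazy-preempt plumbing; only schedule()
is the actual kernel entry point. The idea: count the mutexes a task holds,
and only honour the lazy resched request when the count drops back to zero.

#include <linux/sched.h>

/* Sketch only: invented per-task state, not the real implementation. */
struct task_hold_state {
        int mutex_hold_depth;   /* how many mutexes this task holds */
        bool lazy_resched;      /* deferred NEED_RESCHED_LAZY request */
};

static void hold_mutex(struct task_hold_state *t)
{
        /* ... actual mutex acquisition elided ... */
        t->mutex_hold_depth++;
}

static void release_mutex(struct task_hold_state *t)
{
        /* ... actual mutex release elided ... */
        if (--t->mutex_hold_depth == 0 && t->lazy_resched) {
                t->lazy_resched = false;
                schedule();     /* only now give up the CPU */
        }
}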

I can think of two typical uses of a mutex.
One is a short code path that doesn't usually sleep, but calls kmalloc()
so might do so. That could be quite 'hot' and you wouldn't really want
extra 'random' preemption there.
The other is a long path that is going to wait for IO, where any concurrent
caller is expected to sleep. Extra preemption there probably won't matter.
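For concreteness, the two shapes look roughly like this (illustrative only -
the locks, short_path() and long_path() are made up, error handling omitted;
mutex_lock(), kmalloc() and wait_for_completion() are the usual kernel calls):

#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/completion.h>

static DEFINE_MUTEX(short_lock);
static DEFINE_MUTEX(long_lock);

/* Hot path: usually doesn't sleep, but kmalloc(GFP_KERNEL) might. */
static void *short_path(size_t len)
{
        void *p;

        mutex_lock(&short_lock);
        p = kmalloc(len, GFP_KERNEL);   /* may sleep, normally doesn't */
        /* ... a few lines of bookkeeping ... */
        mutex_unlock(&short_lock);
        return p;
}

/* Slow path: expected to block for IO; contenders expect to sleep anyway. */
static int long_path(struct completion *io_done)
{
        mutex_lock(&long_lock);
        wait_for_completion(io_done);   /* holds the mutex across the IO wait */
        mutex_unlock(&long_lock);
        return 0;
}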

So maybe the software should give the scheduler a hint.
Perhaps as a property of the mutex, or of the acquire request?
Or by feeding the time the mutex has been held into the mix.
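Something like this, say - purely hypothetical, nothing like it exists today,
and mutex_lock_hinted(), the hold-hint enum and current_defer_lazy_resched()
are all made-up names:

#include <linux/mutex.h>

enum mutex_hold_hint {
        MUTEX_HOLD_SHORT,       /* short critical section: defer lazy preemption */
        MUTEX_HOLD_LONG,        /* will sleep/do IO anyway: preempt as normal */
};

/* Invented helper: ask the scheduler to sit on NEED_RESCHED_LAZY
 * until this task drops its last mutex. */
void current_defer_lazy_resched(void);

static inline void mutex_lock_hinted(struct mutex *lock,
                                     enum mutex_hold_hint hint)
{
        mutex_lock(lock);
        if (hint == MUTEX_HOLD_SHORT)
                current_defer_lazy_resched();
}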

It is all a bit like the difference between using spin_lock() and
spin_lock_irqsave() for a lock that will never be held by an ISR.
If the lock is ever likely to be contended you (IMHO) really want
to use the 'irqsave' version in order to stop the waiting thread
spinning for the duration of the ISR (and the following softint).
There is (probably) little point worrying about the IRQ latency: a
single readl() to our FPGA-based PCIe slave will spin a 3GHz CPU
for around 3000 clocks - that is a lot of normal code.
I've had terrible problems avoiding the extra pthread_mutex() hold
times caused by ISRs and softints; they must have the same effect on
spin_lock().
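For reference, the two forms side by side (standard kernel API; my_lock and
the two functions are just illustrative). Disabling local IRQs in the second
form is what stops an ISR on the holding CPU from stretching the hold time:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(my_lock);

static void plain_lock_path(void)
{
        spin_lock(&my_lock);    /* IRQs stay enabled: an ISR here extends the hold */
        /* ... critical section ... */
        spin_unlock(&my_lock);
}

static void irqsave_lock_path(void)
{
        unsigned long flags;

        spin_lock_irqsave(&my_lock, flags);     /* local IRQs off: hold stays short */
        /* ... critical section ... */
        spin_unlock_irqrestore(&my_lock, flags);
}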

David
