Re: [PATCH v2 2/5] Introducing qpw_lock() and per-cpu queue & flush work
From: Leonardo Bras
Date: Tue Mar 10 2026 - 20:21:04 EST
On Mon, Mar 09, 2026 at 11:14:23AM +0100, Vlastimil Babka (SUSE) wrote:
> On 3/8/26 19:00, Leonardo Bras wrote:
> > On Tue, Mar 03, 2026 at 01:02:13PM -0300, Marcelo Tosatti wrote:
> >> On Tue, Mar 03, 2026 at 01:03:36PM +0100, Vlastimil Babka (SUSE) wrote:
> >> > On 3/2/26 16:49, Marcelo Tosatti wrote:
> >> > > +#define local_qpw_lock(lock) \
> >> > > + do { \
> >> > > + if (static_branch_maybe(CONFIG_QPW_DEFAULT, &qpw_sl)) { \
> >> > > + migrate_disable(); \
> >> >
> >> > Have you considered using migrate_disable() on PREEMPT_RT and
> >> > preempt_disable() on !PREEMPT_RT since it's cheaper? It's what the pcp
> >> > locking in mm/page_alloc.c does, for that reason. It should reduce the
> >> > overhead with qpw=1 on !PREEMPT_RT.
> >>
> >> migrate_disable:
> >> Patched kernel, CONFIG_QPW=y, qpw=1: 192 cycles
> >>
> >> preempt_disable:
> >> [ 65.497223] kmalloc_bench: Avg cycles per kmalloc: 184 cycles
> >>
> >> I tried it before, but it was crashing for some reason which I didn't
> >> look into (perhaps PREEMPT_RT was enabled).
> >>
> >> Will change this for the next iteration, thanks.
> >>
> >
> > Hi all,
> >
> > That reminded me that RT spinlocks already use migrate_disable() and
> > non-RT spinlocks already use preempt_disable().
> >
> > Maybe it's actually worth adding a local_spin_lock() in spinlock{,_rt}.c
> > which would resolve the per-cpu variable inside the preempt/migrate_disable
> > region, and making use of it in the qpw code. That way we avoid nesting
> > migrate_disable() or preempt_disable(), further reducing the impact.
>
> That would be nice indeed. But since the nested disable/enable cost should
> be low, and the spinlock code rather complicated, it might be tough to sell.
> It would be also great to have those trylocks inline on all arches.
Fair enough.
I will take a look at the spinlock code later; maybe we can have one in the
qpw code that can be used internally without impacting other users.
>
> > The alternative is to not have migrate/preempt disable here and actually
> > trust the ones inside the locking primitives. There is a chance of
> > contention, but I don't remember being able to detect it.
>
> So then we could pick the lock on one cpu but then get migrated and actually
> lock it on another cpu. Is contention the only possible downside of this, or
> could it lead to subtle bugs depending on the particular user? The paths
> that don't flush stuff on remote cpus but expect working with the local
> cpu's structure in a fastpath might get broken. I'd be wary of this.
Yeah, that's right. Contention could be really bad for realtime, however
rarely it happens.
And you are right about the potential bugs: for functions that operate on
local per-cpu data (this_cpu_read/write) it would be expensive to use
per_cpu_read/write(), so IIRC Marcelo did not convert those in functions
that always run on the local CPU. If the CPU migrates before we get the
lock, we will safely operate remotely on that CPU's data, but any
this_cpu_*() in the function will operate on the local CPU instead of the
remote one.
So you and Marcelo are correct: we can't have migration/preemption happening
during the routine, which means we must disable them before we pick the CPU.
Thanks!
Leo