Re: [PATCH 1/1] rcu/tree: Drop the lock before entering to page allocator

From: Joel Fernandes
Date: Thu Jul 16 2020 - 09:36:52 EST


On Thu, Jul 16, 2020 at 11:19:13AM +0200, Uladzislau Rezki wrote:
> On Wed, Jul 15, 2020 at 07:13:33PM -0400, Joel Fernandes wrote:
> > On Wed, Jul 15, 2020 at 2:56 PM Sebastian Andrzej Siewior
> > <bigeasy@xxxxxxxxxxxxx> wrote:
> > >
> > > On 2020-07-15 20:35:37 [+0200], Uladzislau Rezki (Sony) wrote:
> > > > @@ -3306,6 +3307,9 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
> > > > if (IS_ENABLED(CONFIG_PREEMPT_RT))
> > > > return false;
> > > >
> > > > + preempt_disable();
> > > > + krc_this_cpu_unlock(*krcp, *flags);
> > >
> > > Now you enter memory allocator with disabled preemption. This isn't any
> > > better but we don't have a warning for this yet.
> > > What happened to the part where I asked for a spinlock_t?
> >
> > Ulad,
> > Wouldn't the replacing of preempt_disable() with migrate_disable()
> > above resolve Sebastian's issue?
> >
> This for regular kernel only. That means that migrate_disable() is
> equal to preempt_disable(). So, no difference.

But this will force preempt_disable() context into the low-level page
allocator on -RT kernels which I believe is not what Sebastian wants. The
whole reason why the spinlock vs raw-spinlock ordering matters is, because on
RT, the spinlock is sleeping. So if you have:

raw_spin_lock(..);
spin_lock(..); <-- can sleep on RT, so Sleep while atomic (SWA) violation.

That's the main reason you are dropping the lock before calling the
allocator.

But if you call preempt_disable() and drop the lock, then that may fix the
lock ordering violation only to reintroduce the SWA issue, which is why you
wanted to drop the lock in the first place.

migrate_disable() on -RT kernels will not disable preemption which is where
Sebastian is coming from I think:
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/tree/kernel/sched/core.c?h=v5.4-rt#n8178

In your defense, the "don't disable preemption on migrate_disable()" on -RT
part is not upstream yet. So maybe this will not work on upstream PREEMPT_RT,
but I'm not sure whether folks are really running upstream PREEMPT_RT without
going through the RT tree.

I did not see the drawback of just keeping it as migrate_disable() which
should make everyone happy but Sebastian may have other opinions such as
maybe that converting the lock to normal spinlock makes the code work on
upstream's version of PREEMPT_RT without requiring the migrate_disable()
change. But he can elaborate more.

Paul, how does all this sound to you?

thanks,

- Joel