Re: [PATCH] rcu-tasks: Directly invoke rcuwait_wake_up() in call_rcu_tasks_generic()
From: Joel Fernandes
Date: Thu Feb 23 2023 - 22:12:02 EST
On Thu, Feb 23, 2023 at 10:05 PM Zhang, Qiang1 <qiang1.zhang@xxxxxxxxx> wrote:
>
> On Thu, Feb 23, 2023 at 9:35 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Feb 23, 2023 at 9:25 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, Feb 24, 2023 at 12:36:05AM +0000, Zhang, Qiang1 wrote:
> > > > On Thu, Feb 23, 2023 at 08:43:05AM +0000, Zhang, Qiang1 wrote:
> > > > > > From: Zqiang <qiang1.zhang@xxxxxxxxx>
> > > > > > Sent: Thursday, February 23, 2023 2:30 PM
> > > > > > To: paulmck@xxxxxxxxxx; frederic@xxxxxxxxxx; quic_neeraju@xxxxxxxxxxx;
> > > > > > joel@xxxxxxxxxxxxxxxxx
> > > > > > Cc: rcu@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> > > > > > Subject: [PATCH] rcu-tasks: Directly invoke rcuwait_wake_up() in
> > > > > > call_rcu_tasks_generic()
> > > > > >
> > > > > > According to commit '3063b33a347c ("Avoid raw-spinlocked wakeups from
> > > > > > call_rcu_tasks_generic()")', the grace-period kthread is delayed to wakeup
> > > > > > using irq_work_queue() is because if the caller of
> > > > > > call_rcu_tasks_generic() holds a raw spinlock, when the kernel is built with
> > > > > > CONFIG_PROVE_RAW_LOCK_NESTING=y, due to a spinlock will be hold in
> > > > > > wake_up(), so the lockdep splats will happen. but now using
> > > > > > rcuwait_wake_up() to wakeup grace-period kthread instead of wake_up(), in
> > > > > > rcuwait_wake_up() no spinlock will be acquired, so this commit remove using
> > > > > >
> > > > > >There are still spinlock-acquisition and spinlock-release invocations within the call path from rcuwait_wake_up().
> > > > > >
> > > > > >rcuwait_wake_up() -> wake_up_process() -> try_to_wake_up(), then:
> > > > > >
> > > > > > raw_spin_lock_irqsave()
> > > > > > ...
> > > > > > raw_spin_unlock_irqrestore
> > > > >
> > > > > Yes, but this is raw_spinlock acquisition and release(note: spinlock will convert to
> > > > > sleepable lock in Preempt-RT kernel, but raw spinlock is not change).
> > > > >
> > > > > acquire raw_spinlock -> acquire spinlock will trigger lockdep warning.
> > > > >
> > > > >Is this really safe in the long run though? I seem to remember there are
> > > > >weird locking dependencies if RCU is used from within the scheduler [1].
> > > > >
> > > >
> > > >
> > > > I have been running rcutorture with rcutorture.type = tasks-tracing,
> > > > so far no problems have been found.
> > > >
> > > >
> > > > >I prefer to keep it as irq_work_queue() unless you are seeing some benefit.
> > > > >Generally, there has to be a 'win' or other justification for adding more
> > > > >risk.
> > > > >
> > > > >thanks,
> > > > >
> > > > >- Joel
> > > > >[1] http://www.joelfernandes.org/rcu/scheduler/locking/2019/09/02/rcu-schedlocks.html
> > > >
> > > >
> > > > The problem in this link, in an earlier RCU version, rcu_read_unlock_special()
> > > > Invoke wakeup and enter scheduler can lead to deadlock, but my modification is for
> > > > call_rcu_tasks_generic(), even if there is a lock dependency problem, we should pay
> > > > more attention to rcu_read_unlock_trace_special()
> > >
> > > Consider ABBA deadlocks as well, not just self-deadlocks (which IIRC is what
> > > the straight-RCU rcu_read_unlock() issues were about).
> > >
> > > What prevents the following scenario?
> > >
> > > In the scheduler you have code like this:
> > > rq = task_rq_lock(p, &rf);
> > > trace_sched_wait_task(p);
> > >
> > > Someone can hook up a BPF program to that tracepoint that then calls
> > > rcu_read_unlock_trace() -> rcu_read_unlock_trace_special(). All of
> > > this while holding the rq and pi scheduler locks.
> > >
> > > That's A (rq lock) -> B (rtpcp lock).
>
> In rcu_read_unlock_trace_special(), the premise of acquiring the rtpcp lock is that
> before that, we have task switch in the rcu_read_lock_trace/unlock_trace critical section.
> but after we already hold the rq lock, no task switch is generated in the
> rcu_read_lock_trace/unlock_trace critical section.
>
> Please correct me if my understanding is wrong.
Yes, but in the next reply I corrected myself and I am still concerned
about ABBA. There is obviously *some lock* that is held by the callers
of call_rcu_tasks*(). So there is a dependency that gets created
between _that_ lock and the rq lock, if you do a wakeup here. And I
am not sure whether that lock is also acquired when the BPF program
runs. If it is, then the BPF programs may hang. It is probably worth
checking with the BPF guys.
More importantly, do you see a benefit with this change in terms of
anything more than deleting a few lines of code? Paul typically favors
robustness and guard rails (as do I), unless there is significant
benefit in performance, power or both.
Thanks,
- Joel