Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
From: Paul E. McKenney
Date: Fri May 04 2018 - 15:44:06 EST
On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote:
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote:
> >> Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> writes:
> >>
> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote:
> >> >> Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> writes:
> >> >> > From: Anna-Maria Gleixner <anna-maria@xxxxxxxxxxxxx>
> >> > â
> >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make >
> >> >> > wait_lock irq safe") for different reason.
> >> >>
> >> >> Which tree has this change been made in? I am not finding the commit
> >> >> you mention above in Linus's tree.
> >> >
> >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make
> >> > wait_lock irq safe").
> >>
> >> Can you fix that in your patch description and can you also up the
> >> description of rcu_read_unlock?
> >>
> >> If we don't need to jump through hoops it looks very reasonable to
> >> remove this unnecessary logic. But we should fix the description
> >> in rcu_read_unlock that still says we need these hoops.
> >
> > The hoops are still required for rcu_read_lock(), otherwise you
> > get deadlocks between the scheduler and RCU in PREEMPT=y kernels.
> > What happens with this patch (if I understand it correctly) is that the
> > signal code now uses a different way of jumping through the hoops.
> > But the hoops are still jumped through.
>
> The patch changes:
>
> local_irq_disable();
> rcu_read_lock();
> spin_lock();
> rcu_read_unlock();
>
> to:
>
> rcu_read_lock();
> spin_lock_irq();
> rcu_read_unlock();
>
> Now that I have a chance to relfect on it the fact that the patern
> that is being restored does not work is scary. As the failure has
> nothing to do with lock ordering and people won't realize what is going
> on. Especially since the common rcu modes won't care.
>
> So is it true that taking spin_lock_irq before calling rcu_read_unlock
> is a problem because of rt_mutex_unlock()? Or has b4abf91047cf ("rtmutex: Make
> wait_lock irq safe") actually fixed that and we can correct the
> documentation of rcu_read_unlock() ? And fix __lock_task_sighand?
The problem is that the thing taking the lock might be the scheduler,
or one of the locks taken while the scheduler's pi and rq locks are
held. This occurs only with RCU-preempt.
Here is what can happen:
o A task does rcu_read_lock().
o That task is preempted.
o That task stays preempted for a long time, and is therefore
priority boosted. This boosting involves a high-priority RCU
kthread creating an rt_mutex, pretending that the preempted task
already holds it, and then acquiring it.
o The task awakens, acquires the scheduler's rq lock, and
then does rcu_read_unlock().
o Because the task has been priority boosted, __rcu_read_unlock()
invokes the rcu_read_unlock_special() slowpath, which does
(as you say) rt_mutex_unlock() to deboost. The deboosting
can cause the scheduler to acquire the rq and pi locks, which
results in deadlock.
In contrast, holding these scheduler locks across the entirety of the
RCU-preempt read-side critical section is harmless because then the
critical section cannot be preempted, which means that priority boosting
cannot happen, which means that there will be no need to deboost at
rcu_read_unlock() time.
This restriction has not changed, and as far as I can see is inherent
in the fact that RCU uses the scheduler and the scheduler uses RCU.
There is going to be an odd corner case in there somewhere!
Thanx, Paul