Re: futex breakage in 4.9 stable branch

From: Greg Kroah-Hartman
Date: Thu Mar 04 2021 - 08:13:38 EST


On Thu, Mar 04, 2021 at 10:12:56AM +0100, Mike Galbraith wrote:
> On Mon, 2021-03-01 at 18:29 +0100, Ben Hutchings wrote:
> > On Mon, Mar 01, 2021 at 09:07:03AM +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Mar 01, 2021 at 01:13:08AM +0100, Ben Hutchings wrote:
> > > > On Tue, 2021-02-23 at 15:00 +0100, Greg Kroah-Hartman wrote:
> > > > > I'm announcing the release of the 4.9.258 kernel.
> > > > >
> > > > > All users of the 4.9 kernel series must upgrade.
> > > > >
> > > > > The updated 4.9.y git tree can be found at:
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
> > > > > and can be browsed at the normal kernel.org git web browser:
> > > > >
> > > >
> > > > The backported futex fixes are still incomplete/broken in this version.
> > > > If I enable lockdep and run the futex self-tests (from 5.10):
> > > >
> > > > - on 4.9.246, they pass with no lockdep output
> > > > - on 4.9.257 and 4.9.258, they pass but futex_requeue_pi triggers a
> > > > lockdep splat
> > > >
> > > > I have a local branch that essentially updates futex and rtmutex in
> > > > 4.9-stable to match 4.14-stable. With this, the tests pass and lockdep
> > > > is happy.
> > > >
> > > > Unfortunately, that branch has about another 60 commits.
> >
> > I have now rebased that on top of 4.9.258, and there are "only" 39
> > commits.
> >
> > > > Further, the
> > > > more we change futex in 4.9, the more difficult it is going to be to
> > > > update the 4.9-rt branch. But I don't see any better option available
> > > > at the moment.
> > > >
> > > > Thoughts?
> > >
> > > There were some posted futex fixes for 4.9 (and 4.4) on the stable list
> > > that I have not gotten to yet.
> > >
> > > Hopefully after these are merged (this week), these issues will be
> > > resolved.
> >
> > I'm afraid they are not sufficient.
> >
> > > If not, then yes, they need to be fixed and any help you can provide
> > > would be appreciated.
> > >
> > > As for "difficulty", yes, it's rough, but the changes backported were
> > > required, for obvious reasons :(
> >
> > I had another look at the locking bug and I was able to make a series
> > of 7 commits (on top of the 2 already queued) that is sufficient to
> > make lockdep happy. But I am not very confident that there won't be
> > other regressions. I'll send that over shortly.
>
> This is all I had to do to make 4.4-stable a happy camper again.
>
> futex: fix 4.4-stable 34c8e1c2c025 backport inspired lockdep complaint
>
> 1. 34c8e1c2c025 "futex: Provide and use pi_state_update_owner()" was backported
> to stable, leading to the therein assertion that pi_state->pi_mutex.wait_lock
> be held triggering in 4.4-stable. Fixing that leads to lockdep moan part 2.
>
> 2. b4abf91047cf "rtmutex: Make wait_lock irq safe" is absent in 4.4-stable, but
> wake_futex_pi() nonetheless managed to acquire an unbalanced raw_spin_lock()/
> raw_spin_unlock_irq() pair, which inspires lockdep to moan after aforementioned
> assert has been appeased.
>
> With this applied, futex tests pass, and no longer inspire lockdep gripeage.
>
> Not-Signed-off-by: Mike Galbraith <efault@xxxxxx>
> ---
> kernel/futex.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -874,8 +874,12 @@ static void free_pi_state(struct futex_p
>  	 * and has cleaned up the pi_state already
>  	 */
>  	if (pi_state->owner) {
> +		unsigned long flags;
> +
> +		raw_spin_lock_irqsave(&pi_state->pi_mutex.wait_lock, flags);
>  		pi_state_update_owner(pi_state, NULL);
>  		rt_mutex_proxy_unlock(&pi_state->pi_mutex);
> +		raw_spin_unlock_irqrestore(&pi_state->pi_mutex.wait_lock, flags);
>  	}
>
>  	if (current->pi_state_cache)
> @@ -1406,7 +1410,7 @@ static int wake_futex_pi(u32 __user *uad
>  	if (pi_state->owner != current)
>  		return -EINVAL;
>
> -	raw_spin_lock(&pi_state->pi_mutex.wait_lock);
> +	raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
>  	new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
>
>  	/*
>

Care to sign-off on it so that if this is correct, I can apply it? :)

thanks,

greg k-h