Re: BUG_ON(!newowner) in fixup_pi_state_owner()

From: Mike Galbraith
Date: Wed Nov 04 2020 - 02:42:59 EST


On Wed, 2020-11-04 at 01:56 +0100, Mike Galbraith wrote:
> On Tue, 2020-11-03 at 17:31 -0600, Gratian Crisan wrote:
> > Hi all,
> >
> > I apologize for waking up the futex demons (and replying to my own
> > email), but ...
> >
> > Gratian Crisan writes:
> > >
> > > Brandon and I have been debugging a nasty race that leads to
> > > BUG_ON(!newowner) in fixup_pi_state_owner() in futex.c. So far
> > > we've only been able to reproduce the issue on 4.9.y-rt kernels.
> > > We are still testing if this is a problem for later RT branches.
> >
> > I was able to reproduce the BUG_ON(!newowner) in fixup_pi_state_owner()
> > with a 5.10.0-rc1-rt1 kernel (currently testing 5.10.0-rc2-rt4).
>
> My box says it's generic.

---
 kernel/futex.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2383,7 +2383,18 @@ static int fixup_pi_state_owner(u32 __us
 		 * Since we just failed the trylock; there must be an owner.
 		 */
 		newowner = rt_mutex_owner(&pi_state->pi_mutex);
-		BUG_ON(!newowner);
+
+		/*
+		 * Why? Because I know what I'm doing with these beasts? Nope,
+		 * but what the hell, a busy restart loop let f_boosted become
+		 * owner, so go for it. Box still boots, works, no longer makes
+		 * boom with fbomb_v2, and as an added bonus, didn't even blow
+		 * futextests all up. Maybe it'll help... or not, we'll see.
+		 */
+		if (unlikely(!newowner)) {
+			err = -EAGAIN;
+			goto handle_err;
+		}
 	} else {
 		WARN_ON_ONCE(argowner != current);
 		if (oldowner == current) {
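
For anyone who wants the idea without the futex details, here is a rough
userspace sketch of what the hunk above does: treat "trylock failed but no
owner is visible yet" as a transient state and report -EAGAIN so the caller
can try again, rather than asserting. Everything named toy_* is made up for
illustration and is not kernel API. The bounded retry count is likewise an
assumption of the sketch, and none of the real locking is modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock's owner field; not the kernel struct. */
struct toy_lock {
	const void *owner;	/* NULL while ownership is still in transit */
	int settle_in;		/* lookups left before the owner becomes visible */
};

/* Toy model of an owner lookup that returns NULL for the first few calls. */
static const void *toy_owner(struct toy_lock *l, const void *pending)
{
	if (l->settle_in > 0) {
		l->settle_in--;
		return NULL;
	}
	l->owner = pending;
	return l->owner;
}

/* One fixup attempt: return -EAGAIN on a NULL owner instead of asserting. */
static int toy_fixup_once(struct toy_lock *l, const void *pending,
			  const void **newowner)
{
	*newowner = toy_owner(l, pending);
	if (!*newowner)
		return -EAGAIN;
	return 0;
}

/* Caller-side restart loop, bounded so the sketch cannot spin forever. */
static int toy_fixup(struct toy_lock *l, const void *pending,
		     const void **newowner, int max_tries)
{
	int err = -EAGAIN;

	while (max_tries-- > 0) {
		err = toy_fixup_once(l, pending, newowner);
		if (err != -EAGAIN)
			break;
	}
	return err;
}
```

If the owner shows up within the retry budget, toy_fixup() returns 0 with
*newowner set. Otherwise it hands -EAGAIN back to its caller.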