Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

From: Mel Gorman
Date: Tue Aug 08 2017 - 12:44:31 EST


On Tue, Aug 08, 2017 at 09:06:48AM -0700, Linus Torvalds wrote:
> On Tue, Aug 8, 2017 at 8:41 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > With my __BUG_FLAGS() issue corrected, the WARN_ON_ONCE() fires once,
> > and everything else seems fine. I'll have a go with additional debug
> > enabled just in case.
>
> Ok, great, a - mostly - false alarm.
>
> I do wonder if we should just remove even that WARN_ON_ONCE() - I
> think it was added to be careful, and the code seems to do the right
> thing.
>

Exactly. I didn't really expect an application to behave in a way that
would create a storm of warning-related bug reports, and even one that
did should fail in userspace anyway in some fashion (e.g. lost wakeup or
unexpected errno). It looks like the reproduction case is replacing the
mapping, so it should be safe to remove the warning: enough time has
passed that any other "interesting" case should have triggered by now.

If Mark confirms that removing the warning is ok for his test case, I'll
send a patch to Thomas with a tag for stable and it should arrive at your
inbox eventually. If I don't hear from Mark, I'll have time to try the
test case in the morning and go from there.

> The second WARN_ON_ONCE() (that is marked as "should be impossible")
> we might as well leave around. If that one triggers, it's a lot more
> interesting.
>

Agreed.

> Mel? No hurry - the nice thing about WARN_ON_ONCE() is that it's just
> a single note so it's neither killing the machine(*) nor causing any
> real problems.
>

That was the intent -- "this is recoverable but I am interested in
hearing if this ever occurs without truncation or unmap being involved".
Assuming no other surprises, it'll be removed relatively shortly.
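
For anyone following along, here is a rough userspace sketch of the
"warn once, then carry on" behaviour being discussed. It is not the
kernel's WARN_ON_ONCE() implementation and not the futex code; the names
warn_on_once(), lookup_key() and warnings_printed are made up purely for
illustration, and the "mapping gone" condition is a stand-in for the
recoverable case.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warnings were actually emitted (illustration only). */
static int warnings_printed;

/*
 * Hypothetical stand-in for a warn-once check: report the condition the
 * first time it is true for a given call site, stay silent afterwards,
 * and always return the condition so the caller can still recover.
 */
static bool warn_on_once(bool cond, bool *warned, const char *file, int line)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	}
	return cond;
}

/*
 * Hypothetical recoverable path: if the mapping has gone away, note it
 * once and return an error instead of crashing.
 */
static int lookup_key(bool mapping_gone)
{
	static bool warned;	/* one flag per call site */

	if (warn_on_once(mapping_gone, &warned, __FILE__, __LINE__))
		return -1;	/* recoverable: caller retries or fails */
	return 0;
}
```

Calling lookup_key(true) repeatedly returns -1 every time but prints a
single warning, which is the point: one note in the log, no dead machine.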

--
Mel Gorman
SUSE Labs