Re: kernel panic: corrupted stack end in wb_workfn

From: Dmitry Vyukov
Date: Wed Mar 20 2019 - 09:57:19 EST


On Wed, Mar 20, 2019 at 2:33 PM Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx> wrote:
>
>
>
> On 3/20/19 1:38 PM, Dmitry Vyukov wrote:
> > On Wed, Mar 20, 2019 at 11:24 AM Tetsuo Handa
> > <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On 2019/03/20 18:59, Dmitry Vyukov wrote:
> >>>> From bisection log:
> >>>>
> >>>> testing release v4.17
> >>>> testing commit 29dcea88779c856c7dc92040a0c01233263101d4 with gcc (GCC) 8.1.0
> >>>> run #0: crashed: kernel panic: corrupted stack end in wb_workfn
> >>>> run #1: crashed: kernel panic: corrupted stack end in worker_thread
> >>>> run #2: crashed: kernel panic: Out of memory and no killable processes...
> >>>> run #3: crashed: kernel panic: corrupted stack end in wb_workfn
> >>>> run #4: crashed: kernel panic: corrupted stack end in wb_workfn
> >>>> run #5: crashed: kernel panic: corrupted stack end in wb_workfn
> >>>> run #6: crashed: kernel panic: corrupted stack end in wb_workfn
> >>>> run #7: crashed: kernel panic: corrupted stack end in wb_workfn
> >>>> run #8: crashed: kernel panic: Out of memory and no killable processes...
> >>>> run #9: crashed: kernel panic: corrupted stack end in wb_workfn
> >>>> testing release v4.16
> >>>> testing commit 0adb32858b0bddf4ada5f364a84ed60b196dbcda with gcc (GCC) 8.1.0
> >>>> run #0: OK
> >>>> run #1: OK
> >>>> run #2: OK
> >>>> run #3: OK
> >>>> run #4: OK
> >>>> run #5: crashed: kernel panic: Out of memory and no killable processes...
> >>>> run #6: OK
> >>>> run #7: crashed: kernel panic: Out of memory and no killable processes...
> >>>> run #8: OK
> >>>> run #9: OK
> >>>> testing release v4.15
> >>>> testing commit d8a5b80568a9cb66810e75b182018e9edb68e8ff with gcc (GCC) 8.1.0
> >>>> all runs: OK
> >>>> # git bisect start v4.16 v4.15
> >>>>
> >>>> Why did the bisection start between 4.16 and 4.15 instead of 4.17 and 4.16?
> >>>
> >>> Because 4.16 was still crashing and 4.15 was not crashing. 4.15..4.16
> >>> looks like the right range, no?
> >>
> >> No, syzbot should bisect between 4.16 and 4.17 regarding this bug, for
> >> "Stack corruption" can't manifest as "Out of memory and no killable processes".
> >>
> >> "kernel panic: Out of memory and no killable processes..." is completely
> >> unrelated to "kernel panic: corrupted stack end in wb_workfn".
> >
> >
> > Do you think this predicate is possible to code?
>
> Something like below probably would work better than the current behavior.
>
> For starters, is_duplicates() might just compare the 'crash' title with the 'target_crash' title and the titles of its duplicates.

Lots of bugs (half?) manifest differently. On top of this, titles
change as we go back in history. On top of this, if we see a different
bug, it does not mean that the original bug is not also there.
This will surely solve some subset of cases better than the current
logic. But I feel that that subset is smaller than what the current
logic solves.
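
To make the trade-off concrete, here is a minimal sketch that applies
both predicates to the runs from the bisection log quoted above. It is
illustrative Python, not syzbot code: the function names are made up,
and "any crash marks the commit bad" is my shorthand for the current
logic, assumed from the fact that 4.16 was treated as still crashing.

```python
TARGET = "kernel panic: corrupted stack end in wb_workfn"
WORKER = "kernel panic: corrupted stack end in worker_thread"
OOM = "kernel panic: Out of memory and no killable processes..."

# Crash title per run, copied from the quoted bisection log (None = run OK).
V4_17_RUNS = [TARGET, WORKER, OOM, TARGET, TARGET,
              TARGET, TARGET, TARGET, OOM, TARGET]
V4_16_RUNS = [None, None, None, None, None, OOM, None, OOM, None, None]


def bad_if_any_crash(runs):
    """Assumed current behaviour: any crash marks the commit bad."""
    return any(title is not None for title in runs)


def bad_if_title_matches(runs, target):
    """Proposed behaviour: only a crash with the target title counts."""
    return any(title == target for title in runs)


print(bad_if_any_crash(V4_17_RUNS), bad_if_title_matches(V4_17_RUNS, TARGET))
# True True
print(bad_if_any_crash(V4_16_RUNS), bad_if_title_matches(V4_16_RUNS, TARGET))
# True False
```

On this log the title match moves the range to 4.16..4.17. Note that
run #1 on 4.17 ("corrupted stack end in worker_thread") would already be
missed by an exact title comparison if it were the only crash seen.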

> syzbot has some knowledge about duplicates with different crash titles when people use the "syz dup" command.

This is a very limited set of info. And in the end I think we've seen
all bug types being duped on all other bug types pair-wise, and at
the same time we've seen all bug types not being dups of all other bug
types. So I don't see where this gets us.
And again, as we go back in history all these titles change.

> Also it might be worth experimenting with neural networks to identify duplicates.
>
>
> target_crash = 'kernel panic: corrupted stack end in wb_workfn'
> test commit:
>     bad = false;
>     skip = true;
>     foreach run:
>         run_started, crashed, crash := run_repro();
>
>         // kernel built, booted, reproducer launched successfully
>         if (run_started)
>             skip = false;
>         if (crashed && is_duplicates(crash, target_crash))
>             bad = true;
>
>     if (skip)
>         git bisect skip;
>     else if (bad)
>         git bisect bad;
>     else
>         git bisect good;
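
For reference, the quoted pseudocode could be written as the runnable
sketch below. This is only an illustration under assumptions:
RunResult, is_duplicate() and classify_commit() are hypothetical names,
known_dups stands in for whatever titles were linked with "syz dup", and
duplicate detection is a plain title comparison as suggested above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass
class RunResult:
    started: bool                 # kernel built, booted, reproducer launched
    title: Optional[str] = None   # crash title, or None if the run did not crash


def is_duplicate(title: str, target: str, known_dups: FrozenSet[str]) -> bool:
    """Hypothetical predicate: exact match on the target or a known dup title."""
    return title == target or title in known_dups


def classify_commit(runs: Iterable[RunResult], target: str,
                    known_dups: FrozenSet[str] = frozenset()) -> str:
    """Return the git bisect verdict for one commit: 'skip', 'bad' or 'good'."""
    skip = True
    bad = False
    for run in runs:
        if run.started:
            skip = False
        if run.title is not None and is_duplicate(run.title, target, known_dups):
            bad = True
    if skip:
        return "skip"
    return "bad" if bad else "good"


target = "kernel panic: corrupted stack end in wb_workfn"
oom = "kernel panic: Out of memory and no killable processes..."

print(classify_commit([RunResult(True), RunResult(True, oom)], target))  # good
print(classify_commit([RunResult(True, target)], target))                # bad
print(classify_commit([RunResult(False)], target))                       # skip
```

As in the pseudocode, "skip" wins over "bad" when no run ever got as far
as launching the reproducer.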