Re: INFO: rcu detected stall in ndisc_alloc_skb

From: Dmitry Vyukov
Date: Sun Jan 06 2019 - 08:24:47 EST


On Sat, Jan 5, 2019 at 11:49 AM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2019/01/03 2:06, Tetsuo Handa wrote:
> > On 2018/12/31 17:24, Dmitry Vyukov wrote:
> >>>> Since this involves OOMs and looks like a one-off induced memory corruption:
> >>>>
> >>>> #syz dup: kernel panic: corrupted stack end in wb_workfn
> >>>>
> >>>
> >>> Why?
> >>>
> >>> RCU stall in this case is likely to be latency caused by flooding of printk().
> >>
> >> Just a hypothesis. OOMs lead to arbitrary memory corruptions, so can
> >> cause stalls as well. But can be what you said too. I just thought
> >> that cleaner dashboard is more useful than a large assorted pile of
> >> crashes. If you think it's actionable in some way, feel free to undup.
> >>
> >
> > We don't know why bpf tree is hitting this problem.
> > Let's continue monitoring this problem.
> >
> > #syz undup
> >
>
> A report at 2019/01/05 10:08 from "no output from test machine (2)"
> ( https://syzkaller.appspot.com/text?tag=CrashLog&x=1700726f400000 )
> says that there are flood of memory allocation failure messages.
> Since continuous memory allocation failure messages itself is not
> recognized as a crash, we might be misunderstanding that this problem
> is not occurring recently. It will be nice if we can run testcases
> which are executed on bpf-next tree.

What exactly do you mean by running test cases on bpf-next tree?
syzbot tests bpf-next, so it executes lots of test cases on that tree.
One can also ask for patch testing on bpf-next tree to test a specific
test case.