Re: INFO: rcu detected stall in ext4_write_checks

From: Theodore Ts'o
Date: Wed Jun 26 2019 - 14:43:19 EST

On Wed, Jun 26, 2019 at 10:27:08AM -0700, syzbot wrote:
> Hello,
> syzbot found the following crash on:
> HEAD commit: abf02e29 Merge tag 'pm-5.2-rc6' of git://
> git tree: upstream
> console output:
> kernel config:
> dashboard link:
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:
> C reproducer:
> The bug was bisected to:
> commit 0c81ea5db25986fb2a704105db454a790c59709c
> Author: Elad Raz <eladr@xxxxxxxxxxxx>
> Date: Fri Oct 28 19:35:58 2016 +0000
> mlxsw: core: Add port type (Eth/IB) set API

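Um, so this doesn't pass the laugh test: a 2016 commit to the mlxsw
network switch driver is not a plausible cause of an RCU stall in the
ext4 write path.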

> bisection log:

It looks like the automated bisection machinery got confused because
the same repro triggered two different failures, and the symptoms
changed as the bisection progressed. Initially, the failure was:

crashed: INFO: rcu detected stall in {sys_sendfile64,ext4_file_write_iter}

Later, the failure changed to something completely different, and much
earlier (before the test was even started):

run #5: basic kernel testing failed: failed to copy test binary to VM: failed to run ["scp" "-P" "22" "-F" "/dev/null" "-o" "UserKnownHostsFile=/dev/null" "-o" "BatchMode=yes" "-o" "IdentitiesOnly=yes" "-o" "StrictHostKeyChecking=no" "-o" "ConnectTimeout=10" "-i" "/syzkaller/jobs/linux/workdir/image/key" "/tmp/syz-executor216456474" "root@xxxxxxxxxxxxx:./syz-executor216456474"]: exit status 1
Connection timed out during banner exchange
lost connection

Looks like an opportunity to improve the bisection engine?
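
As a rough sketch of what that might look like (this is not how
syzbot's bisection is actually implemented; the pattern list, the
same_bug() heuristic, and the classify() helper are all hypothetical
names made up for illustration): a run that fails before the test
starts, or that crashes with an unrelated title, could be reported as
"skip" rather than "bad". The exit codes follow the `git bisect run`
convention, where 0 means good, 125 means the revision cannot be
tested, and other codes from 1 to 127 mean bad.

```python
import re

# Hypothetical markers for failures that occur before the test even starts.
# The strings are taken from the log excerpt above; a real engine would
# likely need a broader list.
INFRA_PATTERNS = [
    re.compile(r"failed to copy test binary to VM"),
    re.compile(r"Connection timed out during banner exchange"),
]

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def same_bug(title: str, original_title: str) -> bool:
    """Crude assumption: two crash titles describe the same bug if the
    text before the first " in " matches (e.g. "INFO: rcu detected stall")."""
    return title.split(" in ", 1)[0] == original_title.split(" in ", 1)[0]


def classify(run_log: str, original_title: str) -> int:
    """Map the log of one bisection step to a good/bad/skip verdict."""
    # Infrastructure failures say nothing about the kernel under test.
    if any(p.search(run_log) for p in INFRA_PATTERNS):
        return SKIP
    for line in run_log.splitlines():
        if line.startswith("crashed: "):
            title = line[len("crashed: "):]
            # A different crash is not evidence for the bug being bisected.
            return BAD if same_bug(title, original_title) else SKIP
    return GOOD
```

With this, the scp timeout in run #5 would be skipped instead of being
counted as a reproduction of the original stall. Whether skipping
unrelated crashes is the right policy is a judgment call; it trades a
less precise bisection range for fewer wrong culprits.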

- Ted