syzbot bisection analysis

From: Dmitry Vyukov
Date: Wed Mar 27 2019 - 13:20:20 EST


Hello,

As most of you have probably already noticed, syzbot started bisecting
cause commits for crashes about 2 weeks ago and sending emails like
this:
https://groups.google.com/d/msg/syzkaller-bugs/2XhfN2Kfbqs/0U3YnKsGBQAJ
The bisection results are also available on the dashboard, e.g.:
https://syzkaller.appspot.com/bug?id=02bde0600a225e8efa31bdce2e7f1b822542fef1

Bisection was probably the most popular feature request for syzbot.
Cause commits make it possible to CC the right people and should also
help to pinpoint the harder bugs. If you are interested in details of the
bisection process, some are described here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md#bisection
The next step will be fix commit bisection, to help identify and
close bugs that are already fixed but that syzbot is not yet aware of.

As expected, automatic bisection of kernel bugs is not completely
trivial and we've got lots of incorrect results. To better understand
what happens and why, and how we are doing, I've analyzed the 118
bisections that we have so far for the following metrics:
- if the bisection was correct or not
- if the crash has multiple manifestations (on the same commit or on
different commits)
- if the bug being hard to reproduce contributed to incorrect bisection
- if unrelated bugs contributed to incorrect bisection
- if skipped commits contributed to incorrect bisection
- if disabled configs contributed to incorrect bisection
There are also some auto-extracted metrics like the start release of
bisection, start/end crash, etc. I won't claim that the analysis is
100% correct, which would require spending a day on each case. But it
should be 95% correct or so. The results are here (there is a second
tab with raw data):
https://docs.google.com/spreadsheets/d/1WdBAN54-csaZpD3LgmTcIMR7NDFuQoOZZqPZ-CUqQgA
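
For readers who want to redo the arithmetic on the raw data, a tally
over the metrics above could look roughly like the sketch below. The
field names and the toy rows are hypothetical (they are not the
spreadsheet's column names or its contents); only the counting logic is
the point.

```python
from dataclasses import dataclass


@dataclass
class Bisection:
    # Hypothetical field names that mirror the metrics listed above.
    correct: bool
    multiple_manifestations: bool = False
    hard_to_reproduce: bool = False
    unrelated_bugs: bool = False
    skipped_commits: bool = False
    disabled_configs: bool = False


FACTORS = ("hard_to_reproduce", "unrelated_bugs",
           "skipped_commits", "disabled_configs")


def summarize(rows):
    """Return (success rate, {factor: share of incorrect results})."""
    incorrect = [r for r in rows if not r.correct]
    success_rate = (len(rows) - len(incorrect)) / len(rows) if rows else 0.0
    shares = {
        f: (sum(getattr(r, f) for r in incorrect) / len(incorrect)
            if incorrect else 0.0)
        for f in FACTORS
    }
    return success_rate, shares


# Toy rows, made up for illustration; not taken from the spreadsheet.
rows = [
    Bisection(correct=True),
    Bisection(correct=True),
    Bisection(correct=False, unrelated_bugs=True, hard_to_reproduce=True),
    Bisection(correct=False, unrelated_bugs=True),
]
rate, shares = summarize(rows)
```

Since one incorrect bisection can have several contributing factors,
the per-factor shares need not add up to 100%.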

Total success rate is slightly above 50%. But there is a strong
correlation with how far back in history we have to go: for recently
introduced bugs the rate is 70+%. And for bugs introduced since v5.0
it's 95%. So hopefully this is a good forecast for the future.

The 2 major contributors to incorrect results look quite fundamental:
- unrelated bugs contributed to 66% of incorrect results
- hard to reproduce bugs contributed to 46% of incorrect results
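
To see why these two factors are so damaging, here is a minimal sketch
of a bisection loop over a toy linear history. It is not syzbot's
implementation: the function name, the `crashes` callback and the
`runs` parameter are all assumptions made for illustration, and a real
kernel history is not linear.

```python
import random


def bisect_first_bad(commits, crashes, runs=1):
    """Binary search for the first bad commit.

    commits: ordered oldest to newest; commits[0] is assumed good and
             commits[-1] is assumed bad (neither is re-tested).
    crashes: hypothetical callback, one reproducer run -> True on crash.
    A commit is judged bad if any of `runs` attempts crashes.
    """
    lo, hi = 0, len(commits) - 1  # invariant: lo judged good, hi judged bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if any(crashes(commits[mid]) for _ in range(runs)):
            hi = mid
        else:
            lo = mid
    return commits[hi]


commits = list(range(100))  # toy history; commit 37 introduces the bug


def real_bug(c):
    return c >= 37


def unrelated_bug(c):
    # A different crash that exists only in commits 20..30 (made up).
    return 20 <= c <= 30


exact = bisect_first_bad(commits, real_bug)
misled = bisect_first_bad(commits, lambda c: real_bug(c) or unrelated_bug(c))

rng = random.Random(0)
# A reproducer that triggers the real bug in only 30% of runs.
late = bisect_first_bad(commits, lambda c: real_bug(c) and rng.random() < 0.3)
```

Here `exact` is 37, while `misled` is 20: once any crash counts as
"bad", the search converges on the commit that introduced the unrelated
bug. With the flaky reproducer, `late` is commit 37 or something newer,
because a missed crash marks a bad commit as good and can only push the
search forward. Raising `runs` makes that less likely at the cost of
more testing per commit.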

I've started collecting feedback/ideas re improving bisection quality here:
https://github.com/google/syzkaller/issues/1051
But so far no magic bullet has come up. So please continue treating the
results with understanding. The incorrect results were usually easy to
identify: a commit to a completely unrelated subsystem, or even to a
non-current arch. There is always a detailed bisection log attached as
well.
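
The "easy to identify" cases could be flagged mechanically with
something like the sketch below. This is only a rough heuristic under
stated assumptions: the function name, the directory-prefix matching
and the `current_arch` default are hypothetical, and nothing here is
part of syzbot.

```python
def looks_suspicious(changed_files, crash_subsystem_dirs, current_arch="x86"):
    """Flag a bisection result that only touches unrelated code.

    changed_files: paths changed by the commit the bisection pointed to.
    crash_subsystem_dirs: directory prefixes (with trailing slash) that
        the crash is believed to belong to, e.g. ["net/"].
    """
    def other_arch(path):
        parts = path.split("/")
        return len(parts) > 1 and parts[0] == "arch" and parts[1] != current_arch

    # Files under arch/<other>/ cannot affect a kernel built for current_arch.
    relevant = [p for p in changed_files if not other_arch(p)]
    if not relevant:
        return True
    # Suspicious if nothing the commit touches lies in the crash's subsystem.
    return not any(p.startswith(d) for p in relevant for d in crash_subsystem_dirs)


only_other_arch = looks_suspicious(["arch/arm64/kernel/setup.c"], ["net/"])
same_subsystem = looks_suspicious(["net/ipv4/tcp.c"], ["net/"])
other_subsystem = looks_suspicious(["drivers/gpu/drm/drm_drv.c"], ["net/"])
```

A check like this can only raise a flag for a human to look at: a
commit outside the crashing subsystem may still be the real cause.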

If you are still here, there were some curious cases too, e.g.:
A bug bisected to a comment-only commit:
https://groups.google.com/d/msg/syzkaller-bugs/1BSkmb_fawo/vz7GhBd0CQAJ
A bug bisected to a release tag:
https://groups.google.com/d/msg/syzkaller-bugs/38HP_pUXJ3s/ehD37HSxDAAJ
And a fault-injection-provoked bug bisected to the addition of the fault
injection facility by me (which is, well, kinda expected):
https://groups.google.com/d/msg/syzkaller-bugs/GYiA5CKTPXw/MA4mO01wDAAJ

Thanks