Re: Reminder: 99 open syzbot bugs in net subsystem

From: Eric Biggers
Date: Thu Jul 25 2019 - 11:01:29 EST


On Wed, Jul 24, 2019 at 11:39:13PM -0400, Theodore Y. Ts'o wrote:
> On Wed, Jul 24, 2019 at 01:09:28PM -0700, David Miller wrote:
> > From: Eric Biggers <ebiggers@xxxxxxxxxx>
> > Date: Wed, 24 Jul 2019 11:37:12 -0700
> >
> > > We can argue about what words to use to describe this situation, but
> > > it doesn't change the situation itself.
> >
> > And we should argue about those words because it matters to humans and
> > affects how they feel, and humans ultimately fix these bugs.
> >
> > So please stop with the hyperbole.
>
> Perhaps it would be better to call them, "syzbot reports". Not all
> syzbot reports are bugs. In fact, Dmitry has steadfastly refused to
> add features which any basic bug-tracking system would have, claiming
> that syzbot should not be a bug-tracking system, and something like
> bugzilla should be forcibly imposed on all kernel developers. So I
> don't consider syzkaller reports as bugs --- they are just reports.
>
> In order for developers to want to engage with "syzbot reports", we
> need to reduce developer toil which syzbot imposes on developers, such
> that it is a net benefit, instead of it being just a source of
> annoying e-mails, some of which are actionable, and some of which are
> noise.
>
> In particular, asking developers to figure out which syzbot reports
> should be closed, because developers found the problem independently,
> and fixed it without hearing about it from syzbot first, really isn't a
> fair thing to ask. Especially if we can automate away the problem.
>
> If there is a reproducer, it should be possible to automatically
> categorize the reproducer as a reliable reproducer or a flakey one.
> If it is a reliable reproducer on version X, and it fails to
> reliably reproduce on version X+N, then it should be able to figure
> out that it has been fixed, instead of requesting that a human confirm
> it. If you really want a human to look at it, now that syzkaller has
> a bisection feature, it should be possible to use the reliable
> reproducer to do a negative bisection search to report a candidate
> fix. This would significantly reduce the developer toil imposed as
> a tax on developers. And if Dmitry doesn't want to auto-close those
> reports that appear to be fixed already, at the very least they should
> be down-prioritized on Eric's reports, so people who don't want to
> waste their time on "bureaucracy" can do so.
>
> Cheers,
>
> - Ted
>
> P.S. Another criterion I'd suggest down-prioritizing on is, "does it
> require root privileges?" After all, since root has so many different
> ways of crashing a system already, and if we're all super-busy, we
> need to prioritize which reports should be addressed first.
>

I agree with all this. Fix bisection would be really useful. I think what we'd
actually need to do to get decent results, though, is consider many different
signals (days since last occurred, repro type, fix bisected, bug bisected,
occurred in mainline or not, does the repro work as root, is it clearly a "bad"
bug like use-after-free, etc.) and compute an appropriate timeout based on that.
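
To make the idea concrete, here is a minimal sketch of combining such signals
into a per-report timeout. Everything in it is hypothetical: the field names,
the multipliers, the 60-day base and the 7-day floor are assumptions chosen for
illustration, not anything syzbot implements today.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical per-report signals; none of these names come from syzbot."""
    days_since_last_seen: int
    has_reproducer: bool
    fix_bisected: bool
    seen_in_mainline: bool
    needs_root: bool
    is_memory_corruption: bool  # e.g. use-after-free


def stale_timeout_days(r: Report, base: int = 60) -> int:
    """Return how many quiet days to wait before treating a report as stale.

    Signals that make a report more important lengthen the timeout;
    signals that suggest it is fixed or low priority shorten it.
    The weights are arbitrary placeholders.
    """
    timeout = base
    if r.is_memory_corruption:
        timeout *= 2
    if r.seen_in_mainline:
        timeout *= 2
    if r.has_reproducer:
        timeout *= 2
    if r.fix_bisected:
        timeout //= 4
    if r.needs_root:
        timeout //= 2
    return max(timeout, 7)  # never time out in under a week


def looks_stale(r: Report) -> bool:
    """True if the report has been quiet for longer than its timeout."""
    return r.days_since_last_seen > stale_timeout_days(r)
```

With these placeholder weights, a root-only report with a bisected fix would
time out after a week, while a use-after-free with a reproducer on mainline
would stay open for well over a year of silence.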

However, I'd like to emphasize that in my reminder emails, I've *already*
considered many of these factors when sorting the bug reports, and in particular
the bugs/reports that have been seen recently are strongly weighted towards
being listed first, especially if they were seen in mainline. In this
particular reminder email, for example, the first 18 bugs/reports have *all*
been seen in the last 4 days.
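
For illustration only, the ordering is roughly in the spirit of the sketch
below. This is not the actual script behind the reminder emails; the scoring
formula, the mainline factor of 2 and the entry titles are assumptions made up
for the example.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical reminder-email entry."""
    title: str
    days_since_last_seen: int
    seen_in_mainline: bool


def priority(e: Entry) -> float:
    """Higher is more urgent: recency dominates, mainline gives a boost."""
    score = 1.0 / (1 + e.days_since_last_seen)
    if e.seen_in_mainline:
        score *= 2  # placeholder weight
    return score


def sort_for_reminder(entries: List[Entry]) -> List[Entry]:
    """Return entries with the most urgent first."""
    return sorted(entries, key=priority, reverse=True)
```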

These first 18 bugs/reports are ready to be worked on and fixed now. It's
unclear to me what is most impeding this. Is it part of the syzbot process?
Bad reproducers? Too much noise? Or is it no funding? Not enough qualified
people? No maintainers? Not enough reminders? Lack of CVEs and demonstrable
exploits? What is most impeding these 18 bugs from being fixed?

- Eric