Re: WARNING in up_write

From: Theodore Y. Ts'o
Date: Thu Apr 05 2018 - 21:37:55 EST


On Thu, Apr 05, 2018 at 05:13:25PM -0700, Eric Biggers wrote:
> Well, ultimately a human needed to investigate the syzbot bug report to figure
> out what was really going on. In my view, the largest problem is that there are
> simply too many bugs, so many are getting ignored. If there were only a few
> bugs, then Dmitry would investigate each one and send a "real" bug report of
> better quality than the automated system can provide, or even send a fix
> directly. But in reality, on the same day this bug was reported, syzbot also
> found 10 other bugs, and in the previous 2 days it had found 38 more. No single
> person can keep up with that. You can see the current bug list, which has 172
> open bugs, on the dashboard at https://syzkaller.appspot.com/. Yes, the kernel
> really is that broken. Though, of course most bugs are in specific modules, not
> the core kernel.

There are a lot of bugs, so it needs to be easier for humans to figure
out which ones they should care about. And not all bugs are created
equal. Some are WARN_ON's that aren't all that important. Others
will hard crash the kernel, but are not likely to be something that
can be turned into a privilege escalation attack. Some bugs are
trivially reproducible, and some take a lot more effort. Making it
easier for humans to decide which ones should be looked at first would
certainly be helpful.

For me the prioritization goes as follows.

1) Is it a regression? If it's a regression, I want to fix it fast.

2) Is it something that can be easily turned into a privilege
escalation attack? Again, if so, I want to fix it fast.

3) Is it going to get in the way of my development process? Things
that trigger new xfstests failures are important, because it's how I
detect (1).
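
As a rough illustration only, the ordering above could be sketched as a
sort key. The Report fields below are made up for the example; they are
not anything syzbot actually reports, and deciding how to fill them in
is the hard part.

```python
from dataclasses import dataclass


@dataclass
class Report:
    title: str
    # Hypothetical flags; a human (or a bisection) would have to set these.
    is_regression: bool = False          # criterion (1)
    privilege_escalation: bool = False   # criterion (2)
    breaks_xfstests: bool = False        # criterion (3)


def triage_key(report):
    # False sorts before True, so each flag is negated: a report matching
    # an earlier criterion sorts ahead of one that only matches a later one.
    return (
        not report.is_regression,
        not report.privilege_escalation,
        not report.breaks_xfstests,
    )


def prioritize(reports):
    # sorted() is stable, so reports with equal keys keep their input order.
    return sorted(reports, key=triage_key)
```

A report for which none of the flags can be determined ends up at the
bottom of the list.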

So I ignored the Syzkaller reports this week because it's hard to
differentiate important bugs from less important ones, and after the
merge window, I want to make sure that I have not introduced any
regressions, and I also want to make sure that commits getting merged
by others have not introduced any regressions in the testing suite
that I use, which is xfstests.

This is why I've been asking for the bisection feature --- not to find
out when a bug has been fixed, but to find out when a bug has been
*introduced*. If I know that this is a bug which has recently been
introduced, especially if it has been recently introduced by commits
in my tree, or which I have recently pushed to Linus, I'm going to
care a lot more. If I can't make that determination, I'm going to
deprioritize that bug in favor of those that definitely do meet these
criteria.

It's not a matter of waiting for someone else to fix it (although I
won't complain if someone does :-). It's that I'm overloaded, and I
have to prioritize the work that I do. If syzbot reports are hard to
parse or hard to prioritize, then I may end up prioritizing other work
as being more important. Sorry, but that's just the way that it is.

Note that I haven't just been complaining about it. I've been working
on ways so that the gce-xfstests and kvm-xfstests test appliances can
more easily be used to work on Syzbot reports. If I can make myself
more efficient, or help other people be more efficient, that's
arguably more important than trying to fix some of the 174 currently
open Syzbot issues --- unless you can tell me that certain ones are
super urgent because they (for example) result in a CVSS score > 8.

Cheers,

- Ted