Re: INFO: task hung in __floppy_read_block_0

From: Lukas Bulwahn
Date: Fri Jul 29 2022 - 10:32:46 EST


On Thu, Jul 28, 2022 at 10:20 PM Dipanjan Das
<mail.dipanjan.das@xxxxxxxxx> wrote:
>
> On Thu, Jul 28, 2022 at 7:23 AM Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx> wrote:
> >
> > Dipanjan, are you really sure that you want to report a "INFO: task
> > hung" bug identified with your syzkaller instance? Especially for a
> > floppy driver, probably in your case even just an emulated one
> > (right?). Reading data from floppies was always very slow as far as I
> > remember those times...
>
> From the bugs reported by syzkaller in the past, we observed that
> several of these "INFO: task hung in ..." reports were considered and
> acted on, for example, this:
> https://groups.google.com/g/syzkaller-bugs/c/L0SBaHZ5bYc. For the
> reported issue, we noticed the read task stays blocked for 143
> seconds, which seemed to be on the higher side, especially given that
> it is an emulated floppy drive (yes, you are right). If this is deemed
> normal, then we do apologize for our misassessment.
>

Maybe some of the "INFO: task hung" reports are considered once in a
while, but when they come from fuzzing, they are often really
difficult to judge. Was the system first put into a strange state, and
did that setup then make the system slow or hang? Often, human users
would never do that, or if they do, they basically have to expect that
the system slows down. So, these reports are generally more difficult
to consider valid. I cannot tell you whether that is what happens in
this case, too. Certainly the floppy driver is a special case by now,
and I would not expect much bug investigation and fixing for it.

If Dmitry and his team have not already answered some of the questions
below and you are coming from an academic background, you might really
want to look into them, which may help you in your work on syzkaller
improvements and in deciding what to report to kernel developers:

We already have https://syzkaller.appspot.com/upstream to track and
report various issues identified by syzkaller.

On this syzbot instance, as of writing, we have 976 issues open, 3904
fixed, and 8461 considered invalid.

The bugs are of different types, e.g., BUG: ..., general protection
fault, INFO: ..., KASAN: ..., KMSAN: ..., memory leak: ..., possible
deadlock: ..., UBSAN: ...

So, from the current data, how many bugs of each type were actively
fixed (i.e., with a dedicated commit to repair the code), not just
closed because the report eventually disappeared? How many bugs of
each type are still open? How long does it take from the first report
to the commit being accepted? Again, e.g., aggregated by type?

That can tell you which types of bugs are really addressed more than
others. And that may help you to decide whether to report a bug from
your syzkaller instance.
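
As a rough starting point for such an aggregation, here is a minimal
sketch. It assumes you have already collected (title, status) pairs
from the dashboard by some means; the function names and the sample
records are made up for illustration and are not real syzbot data or
any syzbot API.

```python
from collections import Counter, defaultdict

# Type prefixes taken from the list above; anything else counts as "other".
BUG_TYPES = ("BUG:", "general protection fault", "INFO:", "KASAN:", "KMSAN:",
             "memory leak", "possible deadlock", "UBSAN:")


def bug_type(title):
    """Return the type prefix that a report title starts with."""
    for prefix in BUG_TYPES:
        if title.startswith(prefix):
            return prefix.rstrip(":")
    return "other"


def count_by_type(bugs):
    """Map bug type -> Counter of statuses for (title, status) pairs."""
    counts = defaultdict(Counter)
    for title, status in bugs:
        counts[bug_type(title)][status] += 1
    return counts


# Hypothetical sample records, only here to make the sketch runnable.
SAMPLE = [
    ("INFO: task hung in __floppy_read_block_0", "open"),
    ("KASAN: use-after-free Read in example_fn", "fixed"),
    ("INFO: task hung in example_fn", "invalid"),
    ("general protection fault in example_fn", "fixed"),
]

if __name__ == "__main__":
    for kind, statuses in sorted(count_by_type(SAMPLE).items()):
        print(kind, dict(statuses))
```

Comparing the fixed/open/invalid counts per type, and adding the time
from first report to fix if you have it, would then answer the
questions above.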

> > Consider the severity of the issue and judge if you would like to
> > point out such a 'bug'.
> >
> > It might happen that:
> >
> > Due to bad judgement on your side, kernel developers and maintainers
> > will consider the value/severity of the provided bug reports overall
> > and then eventually simply ignore all reports that you send.
>
> That would be very unfortunate. Please allow me to explain how we, as
> a *small* academic team, are operating. If you closely follow our
> reportings we did in the last few days, the first “quality control” we
> are doing (to minimize the noise and frustration) is to make sure not
> to report any bug without a reproducer. Now, the unfortunate reality
> is that none of us is a pro kernel hacker with years of expertise in
> tinkering with Linux internals, which essentially means, no matter how
> hard we try, we cannot simply match up the combined level of expertise
> and competency of the people in these mailing groups. We are using our
> best judgement before reporting these bugs. Admittedly, we may be
> wrong, and we apologize in advance for such mishaps. The developers
> can confirm, or refute the reports (if they can spend a line or two
> why they think something we reported is not a problem, we would be
> grateful). In our defense, what we can say is that, in the last few
> days we responded to the developers who asked us to provide details of
> a bug, or test a patch. In fact, we are still in the process of
> responding to some of them, because being a small team, our turnaround
> time is higher than ideal. To answer you, simply ignoring all the
> reports we send might be too harsh (unfair?) to an academic group
> operating in good faith. Providing us pointers like you did above
> (thanks to Greg for helping us in some other thread), and letting us
> know what we did wrong will help us to align ourselves better with the
> reporting and patching workflow.
>

All good, but probably you need to follow some simple guidelines.

If you find an issue in older LTS kernel releases and not on the
current one, you can bisect the issue with the reproducer, and
identify the commit in which the issue is fixed. Then the usual stable
patch acceptance process works.

If you find an issue on the current kernel release, you can bisect the
issue with the reproducer to the commit that introduced the issue.
That is helpful for pinpointing the issue and creating a fix.
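
In practice you would use git bisect for this. Just to illustrate the
idea behind it, here is a minimal sketch, assuming a linear history
with a known good start, a known bad end, and a reproducer that
triggers reliably. The commit names and the reproducer_triggers()
stand-in are hypothetical.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the offending commit is at mid or earlier
        else:
            lo = mid  # the offending commit is after mid
    return commits[hi]


# Hypothetical linear history; "c5" stands in for the offending commit.
HISTORY = ["c%d" % i for i in range(10)]


def reproducer_triggers(commit):
    # Stand-in for building the kernel at this commit and running the
    # reproducer; here it simply pretends that c5 introduced the issue.
    return int(commit[1:]) >= 5


if __name__ == "__main__":
    print(first_bad(HISTORY, reproducer_triggers))
```

Each step halves the remaining range, so even a few thousand commits
need only about a dozen builds and test runs.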

Do not report more issues than you can handle when it comes to testing
suggestions or writing responses. No one expects you to report
everything you find. (We know there are more than 900 bugs open,
reported by syzkaller, so we are not short of bug reports.) However,
if you report, you should really have time to follow up with responses
and work within a reasonable time (probably within a few days). If you
cannot do that full time, one important bug report each week might be
okay and help a bit, rather than automatically sending 1000 bug
reports and never being available for questions on those reports.

> > Dmitry and his team around syzkaller and syzbot can give you more
> > insights on learning a good judgement of what to report, how and when.
>
> We would very much appreciate any help (even positive criticism) from
> the community in this regard.
>

I think there is not much documentation available specific to
reporting bugs from syzkaller, but there are a few best practices that
we already know and we really might want to write up here because "I
run some syzkaller instance and just report whatever I find to the
developers" simply does not work (we have seen that in the past
already). This keeps developers busy and does not necessarily get more
bugs or the important bugs fixed.

Lukas