Re: LKML admins (syzbot emails are not delivered)
From: Eric W. Biederman
Date: Mon Jan 15 2018 - 11:39:41 EST
Dmitry Vyukov <dvyukov@xxxxxxxxxx> writes:
> On Thu, Jan 4, 2018 at 4:23 PM, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>> Dmitry Vyukov <dvyukov@xxxxxxxxxx> writes:
>>
>>> Hi Pavel,
>>>
>>> I've answered this question here in full detail. In short, this is
>>> useful and actionable.
>>> https://groups.google.com/d/msg/syzkaller/2nVn_XkVhEE/GjjfISejCgAJ
>>
>> *Snort*
>>
>> If the information to solve an issue is not in the Oops syzbot is
>> useless.
>
> Hi Eric
>
> That's true. But maintainers of the subsystem are in the best position
> to judge that. For that they need to see the report.
>> Then there is the issue of testing linux-next and reporting errors on
>> who knows what code configuration against code that hasn't changed in
>> linux-next. Which presumably any sane person would assume the errors
>> are introduced by some other piece of new code. But syzbot goes and
>> spams the people who wrote the function where the code is failing.
>
> syzbot uses get_maintainer.pl. If you have better suggestions, I am listening.
> And note: syzbot _always_ provides exact code configuration.
If you are testing linux-next you should really report it to whoever's
branch in linux-next you are testing.
Ideally the tests would be run on mainline and see nothing and then
on linux-next so you know it is a newly introduced error.
Sometimes the branches on linux-next are experimental crap. If someone
adds an experimental memory allocator to linux-next before discovering
it causes all kinds of problems I don't want bug reports about my code
not being able to allocate memory because the memory allocator was bad.
If you don't have the resources to test the individual branches of
linux-next please just test Linus's tree. That will be much more
meaningful and productive.
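
A rough sketch of the triage order described above, in Python. The
function name, the tree labels, and the `reproduces` callback are all
hypothetical and only illustrate the idea; nothing here reflects how
syzbot is actually implemented.

```python
from typing import Callable


def triage(reproduces: Callable[[str], bool]) -> str:
    """Decide where a crash report should go.

    `reproduces` is a hypothetical callback: it runs the reproducer
    against the named tree and returns True if the crash occurs.
    """
    # Test mainline first. A crash here is not specific to linux-next.
    if reproduces("mainline"):
        return "report-against-mainline"
    # Clean on mainline but crashing on linux-next: the error is newly
    # introduced, so it belongs to the owner of the offending branch.
    if reproduces("linux-next"):
        return "report-to-linux-next-branch-owner"
    # Crashes on neither tree: nothing to report.
    return "no-report"
```

The point of the ordering is that a linux-next-only failure is never
sent to the author of unchanged code just because the crash landed in
their function.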
>> Bots can work. We have all of the automatic testing infrastructure
>> against everyone's branches on kernel.org to prove it.
>
> If you mean build/boot testing, then that's an order of magnitude
> simpler problem. You can build on every commit, you can precisely
> pinpoint the guilty commit, etc. Please keep this in mind.
The difference is not what is being tested. The difference is how the
interface to human beings is constructed. If it doesn't feel like
there is a human being willing to work with you on the other end of a
bug report it is not a good situation.
>> syzbot finds weird errors, so that makes the problem space more
>> difficult to deal with.
>
> kernel contains weird errors, that makes the problem space more difficult.
>
>
>> Still I completely don't see the people behind syzbot presumably you
>> Dmitry taking responsibility for syzbot failings. Instead I see excuses
>> like you don't completely control some part of the code that syzbot is
>> built on so can't fix practical real world issues. Like Content-type.
>
> As far as I understand you mean this one:
> https://groups.google.com/d/msg/syzkaller/2nVn_XkVhEE/VSZaokajCgAJ
>
> I probably should have described the rationale in more detail.
> It's not only about technical limitations. It's also about importance
> of a feature, time required to implement it, and in the end if it's
> the right thing to do at all or not. If that were a major issue
> that significantly affects the experience, it would happen one way or
> another regardless of technical limitations. Also simple one-line
> changes generally happen even if they are low profit. But in this case, I
> think it's just the wrong thing to do. .txt is a good, standard
> extension for text files. On the other hand, .syz is a completely
> non-standard extension that no programs know how to deal with. That's why it did
> not happen.
> The support for Reported-by tags as discussed in "syzbot process"
> thread happened within a week.
When I made the complaint it came to me and to messages on lkml as
.log, with Content-Type: application/octet-stream.
That is a bloody mess that wastes people's time. If it is fixed, good;
it certainly was not fixed at that point.
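
For illustration only, a minimal sketch of attaching a log so that it is
labeled text/plain rather than application/octet-stream, using Python's
standard email package. The function name, subject line, and file name
are made up for the example and say nothing about how syzbot builds its
mail.

```python
from email.message import EmailMessage


def build_report(body: str, log_text: str) -> EmailMessage:
    """Build a report whose log attachment is readable as plain text."""
    msg = EmailMessage()
    msg["Subject"] = "example crash report"  # hypothetical subject
    msg.set_content(body)
    # Passing a str with subtype="plain" labels the part text/plain,
    # so mail clients can display it inline instead of treating it as
    # an opaque binary download.
    msg.add_attachment(log_text, subtype="plain", filename="console-log.txt")
    return msg
```

A part labeled application/octet-stream is generic binary data as far as
a mail client is concerned, which is why a text log sent that way
usually has to be saved and opened by hand.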
But it is much more than any single issue. You get defensive when
people criticize syzbot instead of recognizing its failings.
It is fine to say I would like to do xyz but it will take a while before
we can get to it.
> Hope this resolves your concerns.
This email just intensifies them, as it feels again like you are trying
to shift the blame.
>> Bots can be the most horrible thing for a code base. If there is not
>> someone or something going through and filtering out the false positives.
>> If there is not a process to ensure that issues are brought to the
>> proper people's attention so things get fixed. Bots can be completely
>> demoralizing or possibly desensitizing because you keep seeing issues,
>> and nothing you do ever makes the issues go away.
>>
>> Given that no one seems to take any responsibility for syzbot's failures
>> of any kind. Not content-type in the emails. Not the body of the
>> message (which has a massive disclaimer). I don't find syzbot at all
>> useful.
>>
>> Tools are for people, in this case kernel programmers. syzbot has
>> serious usability issues. That makes syzbot a bad tool.
>
> First of all, none of syzbot's reports are false positives in the main
> sense of this term.
*Snort* You are testing linux-next. Who knows what pieces of that are
going to go into a stable kernel. You are not reporting failures
against individual maintainers who changed linux-next; you are
reporting to random people from get_maintainer.pl who may have no
connection with the code change.
So it is a very low quality bug report, on some random mutation of the
Linux kernel. You might as well be testing some random distro kernel
and using get_maintainer.pl to tell us something is wrong.
> You get the same reports from humans as well. Say, there is an invalid
> free in pcrypt which corrupts memory, but the kernel crashes in selinux
> later. You will get a report about selinux from a human.
> syzbot actually makes the situation a bit better to the degree possible as
> it enables almost all debugging configs. So instead of a random
> corruption report, it provides a KASAN report about the exact
> location. Instead of a dead kernel, you get a LOCKDEP report about the exact
> lock inversion, etc.
But I can ask the human what their configuration was and what they were
doing when the error happened. Further things can be prioritized by how
badly the errors affect real people. In practice if bugs don't affect
people more than once, they don't care.
That conversation cannot be had with syzbot.
Outside of the bugs being considered as security issues,
the bugs syzbot finds are generally things that don't affect anyone in
practice. So they are very low on the priority of things to get fixed.
> Now there are duplicates, induced bugs, unexplainable crashes, reports
> mailed to wrong people, etc.
> There are hundreds of subsystems in kernel. And answering any of these
> questions requires expertise in a particular subsystem. Say, this
> crash is also a possible way that bug could manifest. Or, the
> crash happened in this subsystem, but the root cause is actually in
> the upper-level subsystem that misuses this subsystem.
> The right people to deal with this are maintainers of particular
> subsystems. Not a single person that does not work on any of these
> hundreds of subsystems.
I am definitely not asking for expertise in the kernel. I am asking
for a human who wants to help track down bugs in the kernel. Not a
poorly backed up accusation that I did something wrong.
Because that is how syzbot feels today.
Eric