Re: [RFC] syzbot process

From: Dmitry Vyukov
Date: Thu Dec 28 2017 - 06:45:38 EST


On Thu, Dec 28, 2017 at 11:51 AM, Ozgur <ozgur@xxxxxxxxxx> wrote:
>
>
> 28.12.2017, 13:41, "Dmitry Vyukov" <dvyukov@xxxxxxxxxx>:
>> On Fri, Dec 22, 2017 at 4:32 AM, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
>>> On Thu, Dec 21, 2017 at 01:52:40PM +0100, Dmitry Vyukov wrote:
>>>> However, the cost is that it needs to understand statuses of bugs:
>>>> most importantly, what commit fixes what bug. It also has support for
>>>> marking a bug as "invalid", e.g. happened once but most likely was
>>>> caused by a previous silent memory corruption. And support for marking
>>>> bugs as duplicates of other bugs, i.e. the same root cause and will be
>>>> fixed when the target bug is fixed. These simple rules are outlined in
>>>> the footer of each report and also explained in more detail at the
>>>> referenced link:
>>>>
>>>> ----------------------------------
>>>> This bug is generated by a dumb bot. It may contain errors.
>>>> See https://goo.gl/tpsmEJ for details.
>>>> Direct all questions to syzkaller@xxxxxxxxxxxxxxxxx
>>>> Please credit me with: Reported-by: syzbot <syzkaller@xxxxxxxxxxxxxxxx>
>>>> syzbot will keep track of this bug report.
>>>> Once a fix for this bug is merged into any tree, reply to this email with:
>>>> #syz fix: exact-commit-title
>>>> If you want to test a patch for this bug, please reply with:
>>>> #syz test: git://repo/address.git branch
>>>> and provide the patch inline or as an attachment.
>>>> To mark this as a duplicate of another syzbot report, please reply with:
>>>> #syz dup: exact-subject-of-another-report
>>>> If it's a one-off invalid bug report, please reply with:
>>>> #syz invalid
>>>> Note: if the crash happens again, it will cause creation of a new bug report.
>>>> Note: all commands must start from beginning of the line in the email body.
>>>> ----------------------------------
>>>>
>>>> Status tracking allows syzbot to (1) keep track of still unfixed bugs
>>>> (more than half actually get lost in LKML archives if nobody keeps
>>>> track of them), (2) be able to report similar-looking crashes
>>>> as new bugs in the future, (3) be able to test fixes.
>>>>
>>>> The problem is that these rules are mostly not followed.
>>>
>>> As others mentioned, allowing a bug ID to be in the fix's commit message,
>>> perhaps in the Reported-by line which syzbot already suggests to include, would
>>> make things a bit easier.
>>>
>>> But I think the larger problem is that people in the community don't have any
>>> visibility into the statuses of the bugs, so they don't have any motivation to
>>> manage the statuses.
>>>
>>> Are you planning to make a dashboard app publicly available for upstream kernel
>>> bugs being tracked by syzbot? I think it would be very useful for the
>>> community, especially for finding more details about a bug, e.g. when was it
>>> last seen, how often was it seen, has it been seen in multiple trees. Also for
>>> finding duplicates which may not have been sent to the correct mailing list.
>>
>> Hi Eric,
>>
>> Good question. I would very much like to open the UI, and I hope to do
>> it in the near future, but we need to do some additional work to make it
>> possible. The good news is that information is already accumulating
>> and we can do pings, etc.
>
> Hello Dmitry,
>
> I don't think it needs to be a GUI. For example, it could be a console-based UI that we can connect to and get information and fixed patches.

Hi Ozgur,

We will do a web UI first, as it's something that's already partially
there, and syzbot itself is not a console process, it's a cloud
service. It's also handy because there is a lot of contextual
information, and in a web UI one can just click links to navigate
or download a blob. Later we could do an API for console clients, etc.,
if there is interest in developing these types of UIs. But
generally the UI is not the main business of syzbot, it's only a side
thing that helps it achieve the main goal, so it doesn't have a team
of people assigned to it. But you are welcome to contribute, it's all
open-source:
https://github.com/google/syzkaller/tree/master/dashboard/app


> So syzbot is perfect, I found a patch last time :)
>
> https://09738734946362323617.googlegroups.com/attach/3c6ef7059f77c/patch.txt?part=0.2&view=1&vt=ANaJVrFm49WFVkkKiomlnsrdfnv4P-0znjiC4agFB72ibq9_6iqg1rmZtw9-DxS5VvoOoKx8Ikl88sYEQQ45X0vjrwFkKDRaZELV-oU9DVmmrRAMSfStn24
>
> And I have some suggestions:
>
> Please keep URLs short.

Well, that's a URL generated by Google Groups, we don't have control
over it. You also received the patch as an attachment in the syzbot
email.


> And I think syzbot attaches .txt files.
> .txt is not good.

Why are .txt attachments not good? What do you propose to use instead?

Thanks

>>> syzbot also should be sending out reminders for bugs that are still open if the
>>> crash is still occurring, and even moreso if there is a reproducer.
>>
>> Agree. The reasons why this hasn't happened yet are:
>> 1. syzbot is being built up as it's running, I am overwhelmed with
>> hundreds of bugs and also doing lots of work which may be not directly
>> visible but important (e.g. improving quality of generated
>> reproducers, increasing percent of cases when reproducers are created,
>> improving bug title extraction logic, implementing patch testing by
>> request, now this new Reported-by-based process, etc).
>> 2. Just sending an email for each open bug every week is simple, but I
>> am afraid it won't be warmly welcomed. The open questions are: how
>> frequently should syzbot ping? should repro/no repro affect this? what
>> to do if it stopped happening? stopped happening for how long? and
>> what if it happened just a few times, so we can't really conclude if it
>> still happens or not (but we've seen very bad races manifesting this
>> way)? how should it interact with the following point?
>>
>>> However, if the crash isn't still occurring, then I expect it will become
>>> necessary to automatically invalidate the bug after some time, lest the list of
>>> bugs grow without bound due to bugs that have already been fixed that no one has
>>> time to debug to figure out exactly when/what the fix was, especially if there
>>> is no reproducer. Or perhaps the bug was only in linux-next and only existed
>>> due to a buggy patch which was dropped or modified before it reached mainline,
>>> so there is no "fix" commit.
>>
>> Good point. I think we will need to do this in some form in future.
>> Again open questions:
>> - what is the precise formula behind "isn't still occurring"?
>> - should we only close "no repro" bugs?
>> - should we re-test bugs with repro? (re-testing is not 100% precise,
>> so we will lose some real subtle bugs this way)
>>
>> Thanks
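
The report footer quoted above lists four reply commands (fix, test, dup,
invalid) and notes that every command must start at the beginning of a line
in the email body. A minimal sketch of how such replies could be scanned
follows. It is not syzbot's actual parser: the names SyzCommand and
parse_syz_commands, the regular expression, and the decision to silently
skip malformed lines are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class SyzCommand(NamedTuple):
    """One parsed command (hypothetical type, not part of syzbot)."""
    name: str
    args: str


# Commands named in the quoted footer, mapped to whether they take an argument.
KNOWN_COMMANDS = {"fix": True, "test": True, "dup": True, "invalid": False}

# The command must start in column 0, so quoted ("> #syz ...") or indented
# lines are ignored. The optional group captures everything after the colon.
_COMMAND_RE = re.compile(r"^#syz[ \t]+([a-z]+)(?::[ \t]*(.*))?[ \t]*$")


def parse_syz_commands(body: str) -> List[SyzCommand]:
    """Return the well-formed #syz commands found in an email body."""
    commands = []
    for line in body.splitlines():
        match = _COMMAND_RE.match(line)
        if match is None:
            continue
        name = match.group(1)
        args = (match.group(2) or "").strip()
        if name not in KNOWN_COMMANDS:
            continue  # unknown command word: skipped (assumption)
        needs_arg = KNOWN_COMMANDS[name]
        if needs_arg and not args:
            continue  # e.g. "#syz fix:" with no commit title: skipped
        commands.append(SyzCommand(name, args if needs_arg else ""))
    return commands


if __name__ == "__main__":
    reply = (
        "Thanks for the report.\n"
        "#syz fix: exact-commit-title\n"
        "> #syz invalid\n"
    )
    for command in parse_syz_commands(reply):
        print(command.name, command.args)
```

Anchoring the pattern at column 0 is what keeps a command quoted from an
earlier message from being executed again when someone replies to the thread.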