Re: WARNING in usbhid_raw_request/usb_submit_urb (2)

From: Andrey Konovalov
Date: Tue Jan 07 2020 - 09:28:44 EST


On Fri, Jan 3, 2020 at 6:01 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, 3 Jan 2020, syzbot wrote:
>
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger
> > crash:
> >
> > Reported-and-tested-by:
> > syzbot+10e5f68920f13587ab12@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > Tested on:
> >
> > commit: ecdf2214 usb: gadget: add raw-gadget interface
> > git tree: https://github.com/google/kasan.git
> > kernel config: https://syzkaller.appspot.com/x/.config?x=b06a019075333661
> > dashboard link: https://syzkaller.appspot.com/bug?extid=10e5f68920f13587ab12
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > patch: https://syzkaller.appspot.com/x/patch.diff?x=177f06e1e00000
> >
> > Note: testing is done by a robot and is best-effort only.
>
> Andrey:
>
> Clearly something strange is going on here. First, the patch should
> not have changed the behavior; all it did was add some log messages.
> Second, I don't see how the warning could have been triggered at all --
> it seems to be complaining that 2 != 2.

Hi Alan,

It looks like some kind of race is involved here.

There are a few indications of that: 1. no C reproducer was generated
for this crash (this usually happens because of timing differences
between executing the syz repro and the C repro), 2. the syz repro has
the threaded, collide and repeat flags turned on (which means it gets
executed many times with some syscalls scheduled asynchronously).

This also explains the weirdness around the 2 != 2 check failing:
first the comparison failed, then another thread updated one of the
numbers being compared, and only then did the printk statement get
executed.
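
To illustrate the interleaving I have in mind, here is a minimal
userspace sketch. It is not the actual usb/hid code: struct shared,
racer_fixes_value() and both check functions are hypothetical names,
and the "other thread" is modeled as a callback invoked in the window
between the comparison and the message formatting, so that the
ordering is deterministic.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for state shared between two threads. */
struct shared {
	int expected;
	int actual;
};

/* Models another thread running between the check and the log. */
typedef void (*racer_fn)(struct shared *);

/* The racing update: makes the two values equal again. */
void racer_fixes_value(struct shared *s)
{
	s->actual = s->expected;
}

/*
 * Racy pattern: compare, let the racer run, then format the warning
 * by re-reading the shared fields. Returns 1 if a warning was written
 * into buf, 0 if the values matched.
 */
int check_and_warn(struct shared *s, racer_fn racer, char *buf, size_t len)
{
	if (s->actual == s->expected)
		return 0;
	if (racer)
		racer(s);	/* window in which the other thread runs */
	snprintf(buf, len, "mismatch: %d != %d", s->actual, s->expected);
	return 1;
}

/*
 * Same check, but the values are snapshotted once, so the message
 * always shows what the comparison actually saw.
 */
int check_and_warn_snapshot(struct shared *s, racer_fn racer,
			    char *buf, size_t len)
{
	int actual = s->actual;
	int expected = s->expected;

	if (actual == expected)
		return 0;
	if (racer)
		racer(s);
	snprintf(buf, len, "mismatch: %d != %d", actual, expected);
	return 1;
}
```

Starting from expected = 2 and actual = 1, check_and_warn() fails the
comparison but the message it formats shows two equal values, while
the snapshot variant reports the values the check really compared.
Snapshotting only makes the message consistent; it does not remove
the underlying race.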

>
> Does the reproducer really work?

Yes, it worked for syzbot at the very least. It looks like your patch
introduced some delays which made the bug untriggerable by the same
reproducer. Since this is a race, it might unfortunately also be quite
difficult to reproduce manually (due to timing differences caused by a
different environment setup).

Perhaps giving syzbot a less invasive patch (one that minimizes the
timing changes introduced to the code that is suspected of being racy)
would help debug this.

Thanks!