Re: [syzbot] KASAN: use-after-free Read in __usb_hcd_giveback_urb (2)

From: Alan Stern
Date: Mon Dec 12 2022 - 10:52:25 EST


On Mon, Dec 12, 2022 at 01:29:24PM +0100, Oliver Neukum wrote:
>
>
> On 08.12.22 18:40, Alan Stern wrote:
> > On Thu, Dec 08, 2022 at 03:36:45PM +0100, Oliver Neukum wrote:
> > > On 06.12.22 16:38, Alan Stern wrote:
>
> > It's hard to tell what's really going on. Looking at
> > xpad_stop_output(), you see that it doesn't do anything if xpad->type is
> > XTYPE_UNKNOWN. Is that what happened here?
>
> The output anchor in xpad was used. So I have to answer that in the negative.
> > I can't figure out where the underlying race is. Maybe it's not
> > directly connected with anchors after all.
> >
> > > As far as I can tell the order we decrease use_count is correct. But:
> > >
> > > 6ec4147e7bdbd (Hans de Goede 2013-10-09 17:01:41 +0200 1674) usb_anchor_resume_wakeups(anchor);
> > > 94dfd7edfd5c9 (Ming Lei 2013-07-03 22:53:07 +0800 1675) atomic_dec(&urb->use_count);
> > >
> > > Do we need to guarantee memory ordering here?
> >
> > I don't think we need to do anything more. usb_kill_urb() is careful to
> > wait for completion handlers to finish, and we already have
>
> By checking use_count
>
> > smp_mb__after_atomic() barriers in the appropriate places to ensure
> > proper memory ordering.
>
> Do we? Looking at __usb_hcd_giveback_urb():
>
> usb_unanchor_urb(urb);
>
> This is an implicit memory barrier
>
> if (likely(status == 0))
> usb_led_activity(USB_LED_EVENT_HOST);
>
> /* pass ownership to the completion handler */
> urb->status = status;
> /*
> * This function can be called in task context inside another remote
> * coverage collection section, but kcov doesn't support that kind of
> * recursion yet. Only collect coverage in softirq context for now.
> */
> kcov_remote_start_usb_softirq((u64)urb->dev->bus->busnum);
> urb->complete(urb);
> kcov_remote_stop_softirq();
>
> usb_anchor_resume_wakeups(anchor);
> atomic_dec(&urb->use_count);
> /*
> * Order the write of urb->use_count above before the read
> * of urb->reject below. Pairs with the memory barriers in
> * usb_kill_urb() and usb_poison_urb().
> */
> smp_mb__after_atomic();
>
> That is the latest time use_count can go to zero.
> But what is the earliest time the CPU could reorder setting use_count to zero?
> Try as I might the last certain memory barrier I can find in this function
> is usb_unanchor_urb().
> That means another CPU can complete usb_kill_urb() before usb_anchor_resume_wakeups()
> runs.
>
> usb_anchor_resume_wakeups(anchor);
>
> I think we need a memory barrier here, too.
>
> atomic_dec(&urb->use_count);

Okay, how about if this is changed to atomic_dec_return? That puts a
full memory barrier both before and after the atomic decrement, so as a
bonus we could remove the smp_mb__after_atomic call.

Alan Stern