Re: [syzbot] general protection fault in __device_attach

From: Greg KH
Date: Fri Jun 03 2022 - 12:12:10 EST


On Fri, Jun 03, 2022 at 12:03:32PM -0400, Alan Stern wrote:
> On Fri, Jun 03, 2022 at 05:52:38PM +0200, Greg KH wrote:
> > On Fri, Jun 03, 2022 at 11:42:19AM -0400, Alan Stern wrote:
> > > On Fri, Jun 03, 2022 at 02:04:04PM +0300, Andy Shevchenko wrote:
> > > > On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> > > > > syzbot has bisected this issue to:
> > > > >
> > > > > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > > > > Author: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> > > > > Date: Fri Jun 18 13:41:27 2021 +0000
> > > > >
> > > > > ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> > > >
> > > > Hmm... It's not at all obvious how this change could alter the behaviour so
> > > > drastically. device_add() is called from the USB core with
> > > > intf->dev.name == NULL for some reason. A-ha, it seems the fault injector
> > > > made dev_set_name() fail, and the call
> > > >
> > > > dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> > > > dev->devpath, configuration, ifnum);
> > > >
> > > > is missing the return code check.
> > > >
> > > > But I'm not familiar with that code at all, adding Linux USB ML and Alan.
> > >
> > > I can't see any connection between this bug and acpi/sysfs.c. Is it a
> > > bad bisection?
> > >
> > > It looks like you're right about dev_set_name() failing. In fact, the
> > > kernel appears to be littered with calls to that routine which do not
> > > check the return code (the entire subtree below drivers/usb/ contains
> > > only _one_ call that does check the return code!). The function doesn't
> > > have any __must_check annotation, and its kerneldoc doesn't mention the
> > > return code or the possibility of a failure.
> > >
> > > Apparently the assumption is that if dev_set_name() fails then
> > > device_add() later on will also fail, and the problem will be detected
> > > then.
> > >
> > > So now what should happen when device_add() for an interface fails in
> > > usb_set_configuration()?
> >
> > But how can that really fail on a real system?
> >
> > Is this just due to error-injection stuff? If so, I'm really loath to
> > rework the world for something that can never happen in real life.
> >
> > Or is this a real syzbot-found-with-reproducer issue?
>
> Aren't there quite a few reasons why device_add() might fail? (Although
> most of them probably are memory allocation errors...)

I was thinking of the dev_set_name() issue further back in the call
chain.

> Basically, you have to make up your mind. If a function can fail, you
> should be prepared to handle the failure. If it can't fail, there's no
> point in even checking the return code.

True, ok, we should unwind the mess. I'll try to look at it after the
merge window...
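
For illustration only, here is a rough userspace sketch of what "checking the
return code" at the naming step could look like. It is not kernel code and not
a proposed patch: struct mock_device, mock_dev_set_name(), mock_device_add(),
mock_register_interface() and the mock_fail_alloc switch are all hypothetical
stand-ins, and the mock device_add() rejecting a nameless device is an
assumption made to mirror the behaviour described above.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct device; not the kernel type. */
struct mock_device {
	char *name;
};

/* Set to nonzero to simulate an injected allocation failure. */
static int mock_fail_alloc;

/* Mock of dev_set_name(): formats a name, returns 0 or a negative errno. */
static int mock_dev_set_name(struct mock_device *dev, const char *fmt, ...)
{
	va_list ap;
	int len;
	char *buf;

	/* First pass: measure the formatted length. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return -EINVAL;

	buf = mock_fail_alloc ? NULL : malloc((size_t)len + 1);
	if (!buf)
		return -ENOMEM;	/* name is left untouched on failure */

	/* Second pass: write the name into the new buffer. */
	va_start(ap, fmt);
	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);

	free(dev->name);
	dev->name = buf;
	return 0;
}

/* Mock of device_add(): assumed to refuse a device that has no name. */
static int mock_device_add(struct mock_device *dev)
{
	return dev->name ? 0 : -EINVAL;
}

/* Caller that checks the naming step instead of relying on the add step. */
static int mock_register_interface(struct mock_device *intf, int busnum,
				   const char *devpath, int config, int ifnum)
{
	int ret;

	ret = mock_dev_set_name(intf, "%d-%s:%d.%d", busnum, devpath,
				config, ifnum);
	if (ret)
		return ret;	/* bail out before the device is ever added */

	return mock_device_add(intf);
}
```

With mock_fail_alloc set, mock_register_interface() returns -ENOMEM straight
from the naming step and never reaches the add step; with it cleared, the
device gets a name in the "%d-%s:%d.%d" form and the add succeeds.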

But again, is this a "real and able to be triggered from userspace"
problem, or just fault-injection-induced?

thanks,

greg k-h