Re: BUG: scheduling while atomic in f_fs when gadget remove driver

From: Felipe Balbi
Date: Tue Sep 27 2016 - 06:01:49 EST



Hi,

Chen Yu <chenyu56@xxxxxxxxxx> writes:
> Hi All,
>
> I'm working on a Hikey board based around the HiSilicon Kirin 620, with
> linaro kernel version 4.8.rc1, and I get the BUG error below when
> unplugging the USB cable from the PC.

Which peripheral controller does this one have? Is it dwc3?

I'm very interested in knowing about the throughput of adb push with dwc3 + f_fs.

Also, do you know if adb can run outside of an Android environment? I've
been looking for a proper functionfs user for quite some time now :-(

> The function using f_fs is adb, and usb_gadget_unregister_driver is
> called after the USB cable is unplugged from the PC.
>
> [ 89.456512s][pid:1,cpu1,init]BUG: scheduling while atomic: init/1/0x00000002
> [ 89.456573s]Modules linked in:
> [ 89.456604s]Preemption disabled at:[<ffffffc0006a6dc0>] composite_disconnect+0x30/0xac
> [ 89.456665s][pid:1,cpu1,init]TGID: 1 Comm: init
> [ 89.456695s][pid:1,cpu1,init]Call trace:
> [ 89.456726s][pid:1,cpu1,init][<ffffffc00008a5e0>] dump_backtrace+0x0/0x15c
> [ 89.456756s][pid:1,cpu1,init][<ffffffc00008a75c>] show_stack+0x20/0x28
> [ 89.456756s][pid:1,cpu1,init][<ffffffc001153714>] dump_stack+0x84/0xa8
> [ 89.456787s][pid:1,cpu1,init][<ffffffc0000cfc5c>] __schedule_bug+0x88/0xdc
> [ 89.456817s][pid:1,cpu1,init][<ffffffc00115c4f0>] __schedule+0x714/0x854
> [ 89.456817s][pid:1,cpu1,init][<ffffffc00115c678>] schedule+0x48/0xa4
> [ 89.456817s][pid:1,cpu1,init][<ffffffc00115cbf0>] schedule_preempt_disabled+0x4c/0xf4
> [ 89.456848s][pid:1,cpu1,init][<ffffffc00115ea90>] __mutex_lock_slowpath+0xbc/0x1a4
> [ 89.456878s][pid:1,cpu1,init][<ffffffc00115ebd8>] mutex_lock+0x60/0x64
> [ 89.456878s][pid:1,cpu1,init][<ffffffc0006beb00>] ffs_func_eps_disable.isra.17+0x54/0x114
> [ 89.456909s][pid:1,cpu1,init][<ffffffc0006c05a4>] ffs_func_disable+0x30/0xa0
> [ 89.456909s][pid:1,cpu1,init][<ffffffc0006a6c4c>] reset_config.isra.8+0x44/0x78
> [ 89.456939s][pid:1,cpu1,init][<ffffffc0006a6dd8>] composite_disconnect+0x48/0xac
> [ 89.456939s][pid:1,cpu1,init][<ffffffc0006aafd4>] android_disconnect+0x48/0x54
> [ 89.456970s][pid:1,cpu1,init][<ffffffc0006ad9d0>] usb_gadget_remove_driver+0x58/0xa0
> [ 89.456970s][pid:1,cpu1,init][<ffffffc0006ada90>] usb_gadget_unregister_driver+0x78/0xc4
>
> I checked the code of composite_disconnect and found that
> spin_lock_irqsave is called before reset_config, which in turn
> calls ffs_func_eps_disable.
>
> void composite_disconnect(struct usb_gadget *gadget)
> {
> 	struct usb_composite_dev *cdev = get_gadget_data(gadget);
> 	unsigned long flags;
>
> 	/* REVISIT: should we have config and device level
> 	 * disconnect callbacks?
> 	 */
> 	spin_lock_irqsave(&cdev->lock, flags);
> 	if (cdev->config)
> 		reset_config(cdev);
> 	if (cdev->driver->disconnect)
> 		cdev->driver->disconnect(cdev);
> 	spin_unlock_irqrestore(&cdev->lock, flags);
> }
>
> static void ffs_func_eps_disable(struct ffs_function *func)
> {
> 	struct ffs_ep *ep = func->eps;
> 	struct ffs_epfile *epfile = func->ffs->epfiles;
> 	unsigned count = func->ffs->eps_count;
> 	unsigned long flags;
>
> 	do {
> 		if (epfile)
> 			mutex_lock(&epfile->mutex);
> 		spin_lock_irqsave(&func->ffs->eps_lock, flags);
> 		/* pending requests get nuked */
> 		if (likely(ep->ep))
> 			usb_ep_disable(ep->ep);
> 		++ep;
> 		spin_unlock_irqrestore(&func->ffs->eps_lock, flags);
>
> 		if (epfile) {
> 			epfile->ep = NULL;
> 			kfree(epfile->read_buffer);
> 			epfile->read_buffer = NULL;
> 			mutex_unlock(&epfile->mutex);
> 			++epfile;
> 		}
> 	} while (--count);
> }
>
> Should epfile->read_buffer be cleared somewhere else, so that the
> mutex_lock can be removed from ffs_func_eps_disable?

You are correct. There's a bug there. Can you try to propose a fix for
it?
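
To make the failing pattern concrete, here is a rough userspace sketch of the
call chain in the trace. It is only a model and not kernel code: every model_*
function, atomic_depth, sleep_violations and run_model are hypothetical names
made up for illustration. It assumes that holding a spinlock means "atomic
context" and that taking a mutex counts as a call that may sleep.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model, NOT kernel code. */
static int atomic_depth;     /* > 0 while a modelled spinlock is held */
static int sleep_violations; /* sleeping calls attempted in atomic context */

static void model_spin_lock(void)   { atomic_depth++; }
static void model_spin_unlock(void) { atomic_depth--; }

/*
 * Models mutex_lock(): it may sleep, so it must not be called while a
 * spinlock is held. The real kernel only schedules in the contended
 * slow path (as in the trace); this model flags every attempt.
 */
static void model_mutex_lock(void)
{
	if (atomic_depth > 0) {
		sleep_violations++;
		fprintf(stderr, "model: scheduling while atomic (depth %d)\n",
			atomic_depth);
	}
}

static void model_mutex_unlock(void) { }

/* Models ffs_func_eps_disable(): mutex first, then the eps spinlock. */
static void model_eps_disable(void)
{
	model_mutex_lock();   /* stands in for epfile->mutex */
	model_spin_lock();    /* stands in for ffs->eps_lock */
	model_spin_unlock();
	model_mutex_unlock();
}

/* Models composite_disconnect(): cdev->lock is held across the call. */
static void model_composite_disconnect(void)
{
	model_spin_lock();    /* stands in for cdev->lock */
	model_eps_disable();  /* reached via reset_config() in the trace */
	model_spin_unlock();
}

/* Runs the modelled disconnect once and returns the violation count. */
static int run_model(void)
{
	atomic_depth = 0;
	sleep_violations = 0;
	model_composite_disconnect();
	assert(atomic_depth == 0); /* lock/unlock calls are balanced */
	return sleep_violations;
}
```

Calling run_model() reports one violation, raised at the modelled mutex_lock,
while calling model_eps_disable() on its own reports none. That matches the
observation above: the mutex is only a problem because the caller already
holds a spinlock.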

thanks

--
balbi
