RE: BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"

From: Paul Durrant
Date: Thu Mar 02 2017 - 07:25:29 EST


> -----Original Message-----
> From: Juergen Gross [mailto:jgross@xxxxxxxx]
> Sent: 02 March 2017 12:13
> To: Wei Liu <wei.liu2@xxxxxxxxxx>
> Cc: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>; xen-devel <xen-
> devel@xxxxxxxxxxxxxxxxxxxx>; Linux Kernel Mailing List <linux-
> kernel@xxxxxxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; Boris Ostrovsky
> <boris.ostrovsky@xxxxxxxxxx>; David Miller <davem@xxxxxxxxxxxxx>; Paul
> Durrant <Paul.Durrant@xxxxxxxxxx>
> Subject: Re: BUG due to "xen-netback: protect resource cleaning on XenBus
> disconnect"
>
> On 02/03/17 13:06, Wei Liu wrote:
> > On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote:
> >> With commits f16f1df65 and 9a6cdf52b we get in our Xen testing:
> >>
> >> [ 174.512861] switch: port 2(vif3.0) entered disabled state
> >> [ 174.522735] BUG: sleeping function called from invalid context at
> >> /home/build/linux-linus/mm/vmalloc.c:1441
> >> [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch
> >> [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: G W
> >> 4.10.0upstream-11073-g4977ab6-dirty #1
> >> [ 174.525517] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011
> >> [ 174.525517] Call Trace:
> >> [ 174.526217] show_stack+0x23/0x60
> >> [ 174.526899] dump_stack+0x5b/0x88
> >> [ 174.527562] ___might_sleep+0xde/0x130
> >> [ 174.528208] __might_sleep+0x35/0xa0
> >> [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20
> >> [ 174.529463] ? __wake_up+0x40/0x50
> >> [ 174.530089] remove_vm_area+0x20/0x90
> >> [ 174.530724] __vunmap+0x1d/0xc0
> >> [ 174.531346] ? delete_object_full+0x13/0x20
> >> [ 174.531973] vfree+0x40/0x80
> >> [ 174.532594] set_backend_state+0x18a/0xa90
> >> [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430
> >> [ 174.533850] ? kfree+0x5b/0xc0
> >> [ 174.534476] ? xenbus_read+0x3d/0x50
> >> [ 174.535101] ? xenbus_read+0x3d/0x50
> >> [ 174.535718] ? xenbus_gather+0x31/0x90
> >> [ 174.536332] ? ___might_sleep+0xf6/0x130
> >> [ 174.536945] frontend_changed+0x6b/0xd0
> >> [ 174.537565] xenbus_otherend_changed+0x7d/0x80
> >> [ 174.538185] frontend_changed+0x12/0x20
> >> [ 174.538803] xenwatch_thread+0x74/0x110
> >> [ 174.539417] ? woken_wake_function+0x20/0x20
> >> [ 174.540049] kthread+0xe5/0x120
> >> [ 174.540663] ? xenbus_printf+0x50/0x50
> >> [ 174.541278] ? __kthread_init_worker+0x40/0x40
> >> [ 174.541898] ret_from_fork+0x21/0x2c
> >> [ 174.548635] switch: port 2(vif3.0) entered disabled state
> >>
> >> I believe calling vfree() when holding a spin_lock isn't a good idea.
> >>
> >
> > Use vfree_atomic instead?
>
> Hmm, isn't this overkill here?
>
> You can just set a local variable with the address and do vfree() after
> releasing the lock.
>

Yep, that's what I was thinking. Patch coming shortly.

Paul

>
> Juergen
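
For illustration, the approach Juergen suggests (remember the address in a local variable while the lock is held, then free it once the lock has been released) might look roughly like the user-space sketch below. This is not the patch referred to above and not the xen-netback code: struct demo_dev, demo_detach_table() and demo_disconnect() are hypothetical names, a pthread mutex stands in for the spinlock, and free() stands in for vfree().

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-device state; not the real xen-netback structure. */
struct demo_dev {
	pthread_mutex_t lock;	/* plays the role of the spinlock */
	void *table;		/* buffer that is only touched under the lock */
};

/*
 * Detach the buffer while holding the lock and hand it back to the caller.
 * Nothing that may sleep is called inside the critical section.
 */
static void *demo_detach_table(struct demo_dev *dev)
{
	void *old;

	pthread_mutex_lock(&dev->lock);
	old = dev->table;	/* remember the address in a local variable */
	dev->table = NULL;	/* other lock holders now see "no table" */
	pthread_mutex_unlock(&dev->lock);

	return old;
}

/* Free the detached buffer only after the lock has been dropped. */
static void demo_disconnect(struct demo_dev *dev)
{
	void *old = demo_detach_table(dev);

	free(old);		/* kernel analogue: vfree(), which may sleep */
}
```

Because the pointer is cleared before the lock is dropped, a second disconnect finds NULL and the free becomes a no-op, so the buffer cannot be freed twice.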