Re: BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"

From: Juergen Gross
Date: Thu Mar 02 2017 - 08:05:47 EST


On 02/03/17 13:06, Wei Liu wrote:
> On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote:
>> With commits f16f1df65 and 9a6cdf52b we get in our Xen testing:
>>
>> [ 174.512861] switch: port 2(vif3.0) entered disabled state
>> [ 174.522735] BUG: sleeping function called from invalid context at
>> /home/build/linux-linus/mm/vmalloc.c:1441
>> [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch
>> [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: G W
>> 4.10.0upstream-11073-g4977ab6-dirty #1
>> [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0
>> 03/14/2011
>> [ 174.525517] Call Trace:
>> [ 174.526217] show_stack+0x23/0x60
>> [ 174.526899] dump_stack+0x5b/0x88
>> [ 174.527562] ___might_sleep+0xde/0x130
>> [ 174.528208] __might_sleep+0x35/0xa0
>> [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20
>> [ 174.529463] ? __wake_up+0x40/0x50
>> [ 174.530089] remove_vm_area+0x20/0x90
>> [ 174.530724] __vunmap+0x1d/0xc0
>> [ 174.531346] ? delete_object_full+0x13/0x20
>> [ 174.531973] vfree+0x40/0x80
>> [ 174.532594] set_backend_state+0x18a/0xa90
>> [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430
>> [ 174.533850] ? kfree+0x5b/0xc0
>> [ 174.534476] ? xenbus_read+0x3d/0x50
>> [ 174.535101] ? xenbus_read+0x3d/0x50
>> [ 174.535718] ? xenbus_gather+0x31/0x90
>> [ 174.536332] ? ___might_sleep+0xf6/0x130
>> [ 174.536945] frontend_changed+0x6b/0xd0
>> [ 174.537565] xenbus_otherend_changed+0x7d/0x80
>> [ 174.538185] frontend_changed+0x12/0x20
>> [ 174.538803] xenwatch_thread+0x74/0x110
>> [ 174.539417] ? woken_wake_function+0x20/0x20
>> [ 174.540049] kthread+0xe5/0x120
>> [ 174.540663] ? xenbus_printf+0x50/0x50
>> [ 174.541278] ? __kthread_init_worker+0x40/0x40
>> [ 174.541898] ret_from_fork+0x21/0x2c
>> [ 174.548635] switch: port 2(vif3.0) entered disabled state
>>
>> I believe calling vfree() when holding a spin_lock isn't a good idea.
>>
>
> Use vfree_atomic instead?

Hmm, isn't this overkill here?

You can just set a local variable with the address and do vfree() after
releasing the lock.


Juergen