Re: [PATCH] deadlock with suspend and quotas
From: Valerie Aurora
Date: Mon Nov 28 2011 - 16:00:20 EST
On Mon, Nov 28, 2011 at 8:04 AM, Jan Kara <jack@xxxxxxx> wrote:
> Hello,
>
> On Fri 25-11-11 15:25:16, Mikulas Patocka wrote:
>> This script causes a kernel deadlock:
>> #!/bin/sh
>> set -e
>> DEVICE=/dev/vg1/linear
>> lvchange -ay $DEVICE
>> mkfs.ext3 $DEVICE
>> mount -t ext3 -o usrquota,grpquota $DEVICE /mnt/test
>> quotacheck -gu /mnt/test
>> umount /mnt/test
>> mount -t ext3 -o usrquota,grpquota $DEVICE /mnt/test
>> quotaon /mnt/test
>> dmsetup suspend $DEVICE
>> setquota -u root 1 2 3 4 /mnt/test &
>> sleep 1
>> dmsetup resume $DEVICE
>>
>> setquota acquired semaphore s_umount for read and then tried to perform
>> a transaction (and waits because the device is suspended).
>> dmsetup resume tries to acquire s_umount for write before resuming the device
>> (and waits for setquota).
>>
>> Here are stacktraces:
>> setquota:
>> [ 67.524456] [<ffffffff810aa84e>] ? get_page_from_freelist+0x31e/0x790
>> [ 67.524529] [<ffffffffa0250265>] ? start_this_handle.isra.9+0x265/0x3b0 [jbd]
>> [ 67.524604] [<ffffffff8105bc00>] ? add_wait_queue+0x60/0x60
>> [ 67.524675] [<ffffffffa02505a1>] ? journal_start+0xc1/0x100 [jbd]
>> [ 67.524742] [<ffffffff810e62d6>] ? kmem_cache_alloc+0xf6/0x1b0
>> [ 67.524808] [<ffffffffa028018d>] ? ext3_acquire_dquot+0x3d/0x80 [ext3]
>> [ 67.524872] [<ffffffff81143749>] ? dqget+0x359/0x3b0
>> [ 67.524916] [<ffffffff81143ad4>] ? dquot_get_dqblk+0x14/0x1b0
>> [ 67.524985] [<ffffffff81147c34>] ? quota_getquota+0x24/0xd0
>> [ 67.525048] [<ffffffff810ff3db>] ? do_path_lookup+0x2b/0x90
>> [ 67.525082] [<ffffffff810ff8cd>] ? kern_path+0x1d/0x40
>> [ 67.525134] [<ffffffff81148311>] ? do_quotactl+0x421/0x540
>> [ 67.525191] [<ffffffff811073f0>] ? dput+0x20/0x230
>> [ 67.525234] [<ffffffff81148507>] ? sys_quotactl+0xd7/0x1a0
>> [ 67.525304] [<ffffffff8130a03b>] ? system_call_fastpath+0x16/0x1b
>>
>> dmsetup resume:
>> [ 67.525887] [<ffffffffa0238280>] ? dev_wait+0xc0/0xc0 [dm_mod]
>> [ 67.525948] [<ffffffff81309225>] ? rwsem_down_failed_common+0xc5/0x160
>> [ 67.526013] [<ffffffff81198a43>] ? call_rwsem_down_write_failed+0x13/0x20
>> [ 67.526058] [<ffffffff81308adc>] ? down_write+0x1c/0x1d
>> [ 67.526103] [<ffffffff810f3a91>] ? thaw_super+0x21/0xc0
>> [ 67.526166] [<ffffffff81124d4d>] ? thaw_bdev+0x6d/0x90
>> [ 67.526223] [<ffffffff8105583e>] ? queue_work+0x4e/0x60
>> [ 67.526269] [<ffffffffa0230e63>] ? unlock_fs+0x23/0x40 [dm_mod]
>> [ 67.526341] [<ffffffffa02336d0>] ? dm_resume+0xb0/0xd0 [dm_mod]
>> [ 67.526388] [<ffffffffa0238420>] ? dev_suspend+0x1a0/0x230 [dm_mod]
>> [ 67.526441] [<ffffffffa0238a59>] ? ctl_ioctl+0x159/0x2a0 [dm_mod]
>> [ 67.526510] [<ffffffff8116c4ee>] ? ipc_addid+0x4e/0xd0
>> [ 67.526555] [<ffffffffa0238bae>] ? dm_ctl_ioctl+0xe/0x20 [dm_mod]
>> [ 67.526620] [<ffffffff811025de>] ? do_vfs_ioctl+0x8e/0x4e0
>> [ 67.526670] [<ffffffff811073f0>] ? dput+0x20/0x230
>> [ 67.526737] [<ffffffff810f3112>] ? fput+0x162/0x220
>> [ 67.526783] [<ffffffff81102a79>] ? sys_ioctl+0x49/0x90
>> [ 67.526838] [<ffffffff8130a03b>] ? system_call_fastpath+0x16/0x1b
>>
>> The following patch fixes the deadlock. When the quota subsystem takes s_umount,
>> it checks if the filesystem is frozen. If it is, we drop s_umount, wait for
>> the filesystem to resume and retry.
> Thanks for the patch. I'm aware of the deadlock and Val Henson is working
> on resolving these types of deadlocks more systematically. But since I
> haven't heard from her for a while, I guess I'll merge your fix and she'll
> update her series to reflect your change since those patches are going to
> go in at earliest in the next merge window.
> To sum up: I've merged your patch.
FYI, I no longer have time to consult in addition to my full-time job.
Canonical is taking over this patch set.
-VAL
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/