Re: [PATCHv3 9/9] zram: add dynamic device add/remove functionality

From: Sergey Senozhatsky
Date: Thu Apr 30 2015 - 02:44:53 EST


On (04/30/15 15:34), Sergey Senozhatsky wrote:
> > Isn't it related to bd_mutex?
>
> I think it is:
>

I meant: I think it is related to bd_mutex.

> [ 216.713922] Possible unsafe locking scenario:
> [ 216.713923]        CPU0                    CPU1
> [ 216.713924]        ----                    ----
> [ 216.713925]   lock(&bdev->bd_mutex);
> [ 216.713927]                                lock(s_active#162);
> [ 216.713929]                                lock(&bdev->bd_mutex);
> [ 216.713930]   lock(s_active#162);
> [ 216.713932]
> *** DEADLOCK ***
>
> > I think the problem of deadlock is that you are trying to remove sysfs file
> > in sysfs handler.
> >
> > #> echo 1 > /sys/xxx/zram_remove
> >
> > kernfs_fop_write - hold s_active
> > -> zram_remove_store
> > -> zram_remove
> > -> sysfs_remove_group - hold s_active *again*
> >
> > Right?
> >
>
> are those same s_active locks?
>

I meant: in the sysfs handler case, the s_active locks belong to different
kernfs nodes. That part should be fine: we remove a sysfs group from another
sysfs group's handler (a zram_control_class_attrs handler removes
zram_disk_attr_group).
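
To make the ordering argument concrete, here is a rough user-space model in
Python. It is not kernel code and not how lockdep is implemented (lockdep
tracks lock classes, read/write and irq state); LockOrderGraph and replay()
are made-up names for illustration. It only records "A was held while B was
taken" edges and flags an acquisition that would close a cycle. The lock
names are taken from the report; the first path is my reading of chain #1
(disksize_store runs under s_active#162 and then takes bd_mutex).

```python
class LockOrderGraph:
    """Toy lock-order graph: edge A -> B means B was taken while A was held."""

    def __init__(self):
        self.edges = {}

    def _reachable(self, src, dst):
        # Iterative DFS over the recorded "held -> taken" edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, held, new):
        # Taking `new` while holding h is a problem if an earlier path
        # already established new -> ... -> h.
        cycle = any(self._reachable(new, h) for h in held)
        for h in held:
            self.edges.setdefault(h, set()).add(new)
        return cycle


def replay(graph, locks):
    """Acquire `locks` in order on one task; return the locks that closed a cycle."""
    held, reports = [], []
    for lock in locks:
        if graph.acquire(held, lock):
            reports.append(lock)
        held.append(lock)
    return reports


graph = LockOrderGraph()

# Path from chain #1: disksize_store() under s_active#162, then bd_mutex.
disksize_path = replay(graph, ["s_active#162", "bd_mutex"])

# Path from chain #0: zram_remove_store() -> zram_remove() -> sysfs_remove_group().
remove_path = replay(
    graph,
    ["s_active#163", "zram_index_mutex", "bd_mutex", "s_active#162"],
)

# Holding one node's s_active while removing a different node is, on its
# own, not a cycle in this model.
nested_only = replay(LockOrderGraph(), ["s_active#163", "s_active#162"])
```

In this model disksize_path and nested_only come back empty, and remove_path
flags only s_active#162. That matches the report: the nested s_active locks
are not the problem by themselves, the bd_mutex <-> s_active#162 inversion is.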

-ss

>
> we hold (s_active#163) and (&bdev->bd_mutex) and want to acquire (s_active#162)
>
> [ 216.713934] 5 locks held by bash/342:
> [ 216.713935] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff811508a1>] vfs_write+0xaf/0x145
> [ 216.713938] #1: (&of->mutex){+.+.+.}, at: [<ffffffff811af1d3>] kernfs_fop_write+0x9c/0x14c
> [ 216.713942] #2: (s_active#163){.+.+.+}, at: [<ffffffff811af1dc>] kernfs_fop_write+0xa5/0x14c
> [ 216.713946] #3: (zram_index_mutex){+.+.+.}, at: [<ffffffffa022276f>] zram_remove_store+0x45/0xba [zram]
> [ 216.713950] #4: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffffa022267b>] zram_remove+0x41/0xf0 [zram]
>
>
> full log:
>
> [ 216.713826] ======================================================
> [ 216.713827] [ INFO: possible circular locking dependency detected ]
> [ 216.713829] 4.1.0-rc1-next-20150430-dbg-00010-ga86accf-dirty #121 Tainted: G O
> [ 216.713831] -------------------------------------------------------
> [ 216.713832] bash/342 is trying to acquire lock:
> [ 216.713833] (s_active#162){++++.+}, at: [<ffffffff811ae88d>] kernfs_remove_by_name_ns+0x70/0x8c
> [ 216.713840]
> but task is already holding lock:
> [ 216.713842] (&bdev->bd_mutex){+.+.+.}, at: [<ffffffffa022267b>] zram_remove+0x41/0xf0 [zram]
> [ 216.713846]
> which lock already depends on the new lock.
>
> [ 216.713848]
> the existing dependency chain (in reverse order) is:
> [ 216.713849]
> -> #1 (&bdev->bd_mutex){+.+.+.}:
> [ 216.713852] [<ffffffff8107d806>] __lock_acquire+0x10c2/0x11cb
> [ 216.713856] [<ffffffff8107e11c>] lock_acquire+0x13d/0x250
> [ 216.713858] [<ffffffff81528fc6>] mutex_lock_nested+0x5e/0x35f
> [ 216.713860] [<ffffffff81184148>] revalidate_disk+0x4b/0x7c
> [ 216.713863] [<ffffffffa02224d0>] disksize_store+0x1b1/0x1f4 [zram]
> [ 216.713866] [<ffffffff813f8994>] dev_attr_store+0x19/0x23
> [ 216.713870] [<ffffffff811afd84>] sysfs_kf_write+0x48/0x54
> [ 216.713872] [<ffffffff811af238>] kernfs_fop_write+0x101/0x14c
> [ 216.713874] [<ffffffff811502c2>] __vfs_write+0x26/0xbe
> [ 216.713877] [<ffffffff811508b2>] vfs_write+0xc0/0x145
> [ 216.713879] [<ffffffff81150fd0>] SyS_write+0x51/0x8f
> [ 216.713881] [<ffffffff8152d097>] system_call_fastpath+0x12/0x6f
> [ 216.713884]
> -> #0 (s_active#162){++++.+}:
> [ 216.713886] [<ffffffff8107b69e>] check_prevs_add+0x19e/0x747
> [ 216.713889] [<ffffffff8107d806>] __lock_acquire+0x10c2/0x11cb
> [ 216.713891] [<ffffffff8107e11c>] lock_acquire+0x13d/0x250
> [ 216.713892] [<ffffffff811adac4>] __kernfs_remove+0x1b6/0x2cd
> [ 216.713895] [<ffffffff811ae88d>] kernfs_remove_by_name_ns+0x70/0x8c
> [ 216.713897] [<ffffffff811b0872>] remove_files+0x42/0x67
> [ 216.713899] [<ffffffff811b0b39>] sysfs_remove_group+0x69/0x88
> [ 216.713901] [<ffffffffa02226a0>] zram_remove+0x66/0xf0 [zram]
> [ 216.713904] [<ffffffffa02227bf>] zram_remove_store+0x95/0xba [zram]
> [ 216.713906] [<ffffffff813fe053>] class_attr_store+0x1c/0x26
> [ 216.713909] [<ffffffff811afd84>] sysfs_kf_write+0x48/0x54
> [ 216.713911] [<ffffffff811af238>] kernfs_fop_write+0x101/0x14c
> [ 216.713913] [<ffffffff811502c2>] __vfs_write+0x26/0xbe
> [ 216.713915] [<ffffffff811508b2>] vfs_write+0xc0/0x145
> [ 216.713917] [<ffffffff81150fd0>] SyS_write+0x51/0x8f
> [ 216.713918] [<ffffffff8152d097>] system_call_fastpath+0x12/0x6f
> [ 216.713920]
> other info that might help us debug this:
>
> [ 216.713922] Possible unsafe locking scenario:
>
> [ 216.713923]        CPU0                    CPU1
> [ 216.713924]        ----                    ----
> [ 216.713925]   lock(&bdev->bd_mutex);
> [ 216.713927]                                lock(s_active#162);
> [ 216.713929]                                lock(&bdev->bd_mutex);
> [ 216.713930]   lock(s_active#162);
> [ 216.713932]
> *** DEADLOCK ***
>
> [ 216.713934] 5 locks held by bash/342:
> [ 216.713935] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff811508a1>] vfs_write+0xaf/0x145
> [ 216.713938] #1: (&of->mutex){+.+.+.}, at: [<ffffffff811af1d3>] kernfs_fop_write+0x9c/0x14c
> [ 216.713942] #2: (s_active#163){.+.+.+}, at: [<ffffffff811af1dc>] kernfs_fop_write+0xa5/0x14c
> [ 216.713946] #3: (zram_index_mutex){+.+.+.}, at: [<ffffffffa022276f>] zram_remove_store+0x45/0xba [zram]
> [ 216.713950] #4: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffffa022267b>] zram_remove+0x41/0xf0 [zram]
> [ 216.713954]
> stack backtrace:
> [ 216.713957] CPU: 1 PID: 342 Comm: bash Tainted: G O 4.1.0-rc1-next-20150430-dbg-00010-ga86accf-dirty #121
> [ 216.713958] Hardware name: SAMSUNG ELECTRONICS CO.,LTD Samsung DeskTop System/Samsung DeskTop System, BIOS 05CC 04/09/2010
> [ 216.713960] ffffffff82400210 ffff8800ba367a28 ffffffff815265b1 ffffffff810785f2
> [ 216.713962] ffffffff8242f970 ffff8800ba367a78 ffffffff8107aac7 ffffffff817bd85e
> [ 216.713965] ffff8800bdeca1a0 ffff8800bdeca9c0 ffff8800bdeca998 ffff8800bdeca9c0
> [ 216.713967] Call Trace:
> [ 216.713971] [<ffffffff815265b1>] dump_stack+0x4c/0x6e
> [ 216.713973] [<ffffffff810785f2>] ? up+0x39/0x3e
> [ 216.713975] [<ffffffff8107aac7>] print_circular_bug+0x2b1/0x2c2
> [ 216.713976] [<ffffffff8107b69e>] check_prevs_add+0x19e/0x747
> [ 216.713979] [<ffffffff8107d806>] __lock_acquire+0x10c2/0x11cb
> [ 216.713981] [<ffffffff8107e11c>] lock_acquire+0x13d/0x250
> [ 216.713983] [<ffffffff811ae88d>] ? kernfs_remove_by_name_ns+0x70/0x8c
> [ 216.713985] [<ffffffff811adac4>] __kernfs_remove+0x1b6/0x2cd
> [ 216.713987] [<ffffffff811ae88d>] ? kernfs_remove_by_name_ns+0x70/0x8c
> [ 216.713989] [<ffffffff811adca8>] ? kernfs_find_ns+0xcd/0x10e
> [ 216.713990] [<ffffffff81529294>] ? mutex_lock_nested+0x32c/0x35f
> [ 216.713992] [<ffffffff811ae88d>] kernfs_remove_by_name_ns+0x70/0x8c
> [ 216.713994] [<ffffffff811b0872>] remove_files+0x42/0x67
> [ 216.713996] [<ffffffff811b0b39>] sysfs_remove_group+0x69/0x88
> [ 216.713999] [<ffffffffa02226a0>] zram_remove+0x66/0xf0 [zram]
> [ 216.714001] [<ffffffffa02227bf>] zram_remove_store+0x95/0xba [zram]
> [ 216.714003] [<ffffffff813fe053>] class_attr_store+0x1c/0x26
> [ 216.714005] [<ffffffff811afd84>] sysfs_kf_write+0x48/0x54
> [ 216.714007] [<ffffffff811af238>] kernfs_fop_write+0x101/0x14c
> [ 216.714009] [<ffffffff811502c2>] __vfs_write+0x26/0xbe
> [ 216.714011] [<ffffffff8116b29b>] ? __close_fd+0x25/0xdd
> [ 216.714013] [<ffffffff81079a27>] ? __lock_is_held+0x3c/0x57
> [ 216.714015] [<ffffffff811508b2>] vfs_write+0xc0/0x145
> [ 216.714017] [<ffffffff81150fd0>] SyS_write+0x51/0x8f
> [ 216.714019] [<ffffffff8152d097>] system_call_fastpath+0x12/0x6f
> [ 216.714063] zram: Removed device: zram0
>
>
> -ss
>