Re: [cgroup or VFS ?] INFO: possible recursive locking detected

From: Peter Zijlstra
Date: Mon Feb 09 2009 - 06:49:20 EST


On Mon, 2009-02-09 at 11:23 +0000, Al Viro wrote:
> On Thu, Jan 08, 2009 at 11:45:43AM +0800, Li Zefan wrote:
> > Hi Al Viro,
> >
> > I hacked the kernel with the patch below (I think it's ok for me
> > to comment out bdev->bd_mount_sem for testing):
>
> > And ran 2 threads:
> > for ((; ;)) # thread 1
> > {
> > mount -t ext3 /dev/sda9 /mnt1
> > umount /mnt1
> > }
> >
> > for ((; ;)) # thread 2
> > {
> > mount -t ext3 /dev/sda9 /mnt2
> > umount /mnt2
> > }
> >
> > And I got the same lockdep warning immediately, so I think it's
> > VFS's issue.
>
> It's a lockdep issue, actually. It _is_ a false positive; we could get rid
> of that if we took destroy_super(s); just before grab_super(), but I really
> do not believe that there's any point.
>
> Frankly, I'd rather see if there's any way to teach lockdep that this instance
> of lock is getting initialized into "one writer" state and that yes, we know
> that it's not visible to anyone, so doing that is safe, TYVM, even though
> we are under spinlock. Then take that sucker to just before set().
>
> In any case, I really do not believe that it might have anything to do with
> the WARN_ON() from another thread...
>
> Comments?

It seems to me we can simply put the new s_umount instance in a
different subclass. It's a bit unusual to use _nested for the outer lock,
but lockdep doesn't particularly care about subclass order.

If there's any issue with the callers of sget() assuming the s_umount
lock being of subclass 0, then there is another annotation we can use to
fix that, but let's not bother with that if this is sufficient.
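
To illustrate why a different subclass silences the report, here is a
rough userspace sketch. It is not kernel code and not how lockdep is
actually implemented; toy_lock, toy_task, toy_acquire() and
toy_sget_pattern() are all made-up names. It only assumes that a held
lock is treated as "the same" as a new one when both the class and the
subclass match, which is the property the annotation relies on.

```c
/*
 * Toy model of a same-class recursion check (hypothetical names,
 * not the kernel's lockdep implementation).
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

struct toy_lock {
	int class_id;		/* stands in for the lock class key */
};

struct held_entry {
	const struct toy_lock *lock;
	int subclass;
};

struct toy_task {
	struct held_entry held[MAX_HELD];
	size_t depth;
};

/*
 * Record the acquisition and return true if the task already holds a
 * lock with the same class and the same subclass, i.e. the case this
 * toy checker would flag as possible recursive locking.
 */
static bool toy_acquire(struct toy_task *t, const struct toy_lock *l,
			int subclass)
{
	bool warn = false;
	size_t i;

	for (i = 0; i < t->depth; i++) {
		if (t->held[i].lock->class_id == l->class_id &&
		    t->held[i].subclass == subclass)
			warn = true;
	}

	assert(t->depth < MAX_HELD);
	t->held[t->depth].lock = l;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return warn;
}

/*
 * Model the sget() pattern: take s_umount of the freshly allocated sb
 * with the given subclass, then s_umount of an old sb with subclass 0.
 * Both locks share one class. Returns true if the checker would warn.
 */
static bool toy_sget_pattern(int new_sb_subclass)
{
	struct toy_lock new_umount = { .class_id = 1 };
	struct toy_lock old_umount = { .class_id = 1 };
	struct toy_task task = { .depth = 0 };
	bool warned;

	warned = toy_acquire(&task, &new_umount, new_sb_subclass);
	warned |= toy_acquire(&task, &old_umount, 0);
	return warned;
}
```

In this model toy_sget_pattern(0) reports a recursion and
toy_sget_pattern(1) does not; 1 is the value of SINGLE_DEPTH_NESTING.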

Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
---
fs/super.c | 17 ++++++++++++++++-
1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 645e540..34ddc86 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -82,7 +82,22 @@ static struct super_block *alloc_super(struct file_system_type *type)
* lock ordering than usbfs:
*/
lockdep_set_class(&s->s_lock, &type->s_lock_key);
- down_write(&s->s_umount);
+ /*
+ * sget() can have s_umount recursion.
+ *
+ * When it cannot find a suitable sb, it allocates a new
+ * one (this one), and tries again to find a suitable old
+ * one.
+ *
+ * In case that succeeds, it will acquire the s_umount
+ * lock of the old one. Since these are clearly distinct
+ * locks, and this object isn't exposed yet, there's no
+ * risk of deadlocks.
+ *
+ * Annotate this by putting this lock in a different
+ * subclass.
+ */
+ down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING);
s->s_count = S_BIAS;
atomic_set(&s->s_active, 1);
mutex_init(&s->s_vfs_rename_mutex);

