Re: [PATCH -tip] remove the BKL: Replace BKL in mount/umount syscalls with a mutex

From: Al Viro
Date: Fri Apr 17 2009 - 14:09:00 EST


On Fri, Apr 17, 2009 at 10:21:06AM -0700, Linus Torvalds wrote:

> Of course, right now we do hold the BKL over _multiple_ downcalls, so in
> that sense it's not actually totally 100% correct and straightforward to
> just move it down. Eg in the generic_shutdown_super() case we do
>
> lock_kernel();
> ->write_super();
> ->put_super();
> invalidate_inodes();
> unlock_kernel();
>
> and obviously if we split it up so that we push a lock_kernel() into both,
> we end up unlocking in between. I doubt anything cares, but it's still a
> technical difference.

No, that's OK. Anything that would rely on the absence of blocking between
the calls of ->write_super() and ->put_super() is simply insane. Not
that other callers of ->write_super() had been under BKL, while we are
at it...
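
To make the "technical difference" concrete, here is a minimal userspace
sketch, not kernel code. Everything in it is hypothetical: a plain pthread
mutex stands in for the BKL (the real BKL is recursive and is dropped on
sleep, which this does not model), and the toy_* names are invented for
illustration. It only shows that pushing the lock down turns one critical
section into two, with a window in between where the lock is free.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the BKL: an ordinary, non-recursive mutex. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_cycles;         /* how many times the lock was taken */

static void big_lock_acquire(void)
{
    int rc = pthread_mutex_lock(&big_lock);
    assert(rc == 0);
    lock_cycles++;
}

static void big_lock_release(void)
{
    int rc = pthread_mutex_unlock(&big_lock);
    assert(rc == 0);
}

/* Toy superblock and toy methods, invented for this sketch. */
struct toy_sb {
    int dirty;
    int alive;
};

static void toy_write_super(struct toy_sb *sb) { sb->dirty = 0; }
static void toy_put_super(struct toy_sb *sb)   { sb->alive = 0; }

/* Shape 1: the caller holds the lock across both downcalls. */
static void shutdown_lock_in_caller(struct toy_sb *sb)
{
    big_lock_acquire();
    toy_write_super(sb);
    toy_put_super(sb);
    big_lock_release();
}

/* Shape 2: the lock is pushed down into each downcall. */
static void locked_write_super(struct toy_sb *sb)
{
    big_lock_acquire();
    toy_write_super(sb);
    big_lock_release();
}

static void locked_put_super(struct toy_sb *sb)
{
    big_lock_acquire();
    toy_put_super(sb);
    big_lock_release();
}

static void shutdown_lock_pushed_down(struct toy_sb *sb)
{
    locked_write_super(sb);
    /* The lock is free here; another thread could run in this window. */
    locked_put_super(sb);
}
```

Both shapes leave the toy superblock in the same final state; the only
observable difference is the unlocked window between the two calls.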

> There are similar issues with 'remount' holding the BKL over longer
> sequences.
>
> Btw, the superblock code really does seem to depend on lock_kernel. Those
> "sb->s_flags" accesses are literally not protected by anything else afaik.

Modifications in there *should* be protected by ->s_umount. Except that
emergency_remount() does down_read() instead of down_write(), for some
reason. And a fs going r/o on error will very likely not hold any
locks at all, BKL included.

Note that most of the readers really couldn't care less about protection.
Single-shot tests for one bit like "is this fs mounted noatime right now?"
are OK as is - we don't *care* if it races with remount, and there is no
way to do anything about such a race anyway.
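
A single-shot test of that kind might look like the following userspace
sketch. It is an assumption-laden illustration: the kernel code in question
does plain loads of the flags word, whereas portable userspace C needs an
atomic object to keep a concurrent read and write from being a data race,
so C11 relaxed atomics are used here instead. The toy_* names and the
TOY_NOATIME value are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NOATIME 0x400u      /* hypothetical flag bit */

struct toy_mount {
    atomic_uint flags;
};

/*
 * One load, one bit test. The answer may be stale by the time the
 * caller acts on it, but no lock would change that: a remount could
 * just as well land right after the lock was dropped.
 */
static bool toy_is_noatime(struct toy_mount *m)
{
    unsigned int snap = atomic_load_explicit(&m->flags,
                                             memory_order_relaxed);
    return (snap & TOY_NOATIME) != 0;
}

/* The "remount" side: flip the bit without disturbing the others. */
static void toy_set_noatime(struct toy_mount *m, bool on)
{
    if (on)
        atomic_fetch_or_explicit(&m->flags, TOY_NOATIME,
                                 memory_order_relaxed);
    else
        atomic_fetch_and_explicit(&m->flags, ~TOY_NOATIME,
                                  memory_order_relaxed);
}
```

What would not be fine is a reader that loads the word twice and expects
both loads to agree; a single snapshot avoids that.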

Read-only is the main exception; we should be mostly OK since the per-vfsmount
r/o rework, but "I have an error and I'll go r/o now" stuff is still messy.

> That said, I think that fs/locks.c is likely a much bigger issue. Very few
> people care about any realtimeness of mount/unmount/remount. But file
> locking? That is much more likely to be an issue.

That is much more likely to require really non-trivial work, BTW. That code
is a *mess* and inventing sane locking for it will be painful.