Re: [PATCH 0/7] breaking the global file_list_lock

From: Stephen Smalley
Date: Mon Jan 29 2007 - 08:38:48 EST


On Sun, 2007-01-28 at 15:11 +0000, Christoph Hellwig wrote:
> On Sun, Jan 28, 2007 at 02:43:25PM +0000, Christoph Hellwig wrote:
> > On Sun, Jan 28, 2007 at 12:51:18PM +0100, Peter Zijlstra wrote:
> > > This patch-set breaks up the global file_list_lock which was found to be a
> > > severe contention point under basically any filesystem intensive workload.
> >
> > Benchmarks, please. Where exactly do you see contention for this?
> >
> >
> > filesystem intensive workload apparently means namespace operation heavy
> > workload, right? The biggest bottleneck I've seen with those is dcache lock.
> >
> > Even if this is becoming a real problem there must be simpler ways to fix
> > this than introducing various new locking primitives and all kinds of
> > complexity.
>
> One good way to fix scalability without all this braindamage is
> to get rid of sb->s_files. Current uses are:
>
> - fs/dquot.c:add_dquot_ref()
>
> This performs it's actual operation on inodes. We should
> be able to check inode->i_writecount to see which inodes
> need quota initialization.
>
> - fs/file_table.c:fs_may_remount_ro()
>
> This one is gone in Dave Hansens per-mountpoint r/o patchkit
>
> - fs/proc/generic.c:proc_kill_inodes()
>
> This can be done with a list inside procfs.
>
> - fs/super.c:mark_files_ro()
>
> This one is only used for do_emergency_remount(), which is
> and utter hack. It might be much better to just deny any
> kind of write access through a superblock flag here.
>
> - fs/selinuxfs.c:sel_remove_bools()
>
> Utter madness. I have no idea how this ever got merged.
> Maybe the selinux folks can explain what crack they were
> on when writing this. The problem would go away with
> a generic rewoke infrastructure.

It was modeled after proc_kill_inodes(), as noted in the comments. So
if you have a suitable replacement for proc_kill_inodes(), you can apply
the same approach to sel_remove_bools().

>
> Once sb->s_files is gone we can also kill of fu_list entirely and
> replace it by a list head entirely in the tty code and make the lock
> for it per-tty.

--
Stephen Smalley
National Security Agency

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/