Re: [PATCH 00/11 RFC] Allow concurrent changes in a directory
From: Andreas Dilger
Date: Fri Dec 20 2024 - 15:55:44 EST
On Dec 19, 2024, at 7:54 PM, NeilBrown <neilb@xxxxxxx> wrote:
>
> A while ago I posted a patchset with a similar goal as this:
>
> https://lore.kernel.org/all/166147828344.25420.13834885828450967910.stgit@noble.brown/
>
> and received useful feedback. Here is a new version.
>
> This version is not complete. It does not change rename and does not
> change any filesystem to make use of the new opportunity for
> parallelism. I'll work on those once the base functionality is agreed
> on.
>
> With this series, instead of a filesystem setting a flag to indicate
> that parallel updates are supported, there is now a new set of inode
> operations with a _shared prefix. If a directory provides a _shared
> interface it will be used with a shared lock on the inode, else the
> current interface will be used with an exclusive lock.
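
The dispatch rule in the paragraph above can be modelled in userspace. This is only a minimal sketch and not code from the patchset: `struct dir`, `struct dir_ops`, `mkdir_shared`, `dir_mkdir()` and the two ops tables are hypothetical names, and a `pthread_rwlock_t` stands in for the directory inode lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the patchset. */
enum lock_mode { LOCK_SHARED, LOCK_EXCLUSIVE };

struct dir;

struct dir_ops {
    /* Classic operation: caller must hold the directory lock exclusively. */
    int (*mkdir)(struct dir *d, const char *name);
    /* Optional operation: safe to call with only a shared lock held. */
    int (*mkdir_shared)(struct dir *d, const char *name);
};

struct dir {
    pthread_rwlock_t lock;      /* stands in for the directory inode lock */
    const struct dir_ops *ops;
    atomic_int nentries;        /* updated atomically, so shared callers are safe */
};

/*
 * Use the _shared operation under a shared lock when the directory
 * provides one, else fall back to the classic operation under an
 * exclusive lock. The lock mode that was chosen is reported via *mode.
 */
int dir_mkdir(struct dir *d, const char *name, enum lock_mode *mode)
{
    int ret;

    if (d->ops->mkdir_shared) {
        *mode = LOCK_SHARED;
        pthread_rwlock_rdlock(&d->lock);
        ret = d->ops->mkdir_shared(d, name);
    } else {
        assert(d->ops->mkdir != NULL);
        *mode = LOCK_EXCLUSIVE;
        pthread_rwlock_wrlock(&d->lock);
        ret = d->ops->mkdir(d, name);
    }
    pthread_rwlock_unlock(&d->lock);
    return ret;
}

/* Toy operation: just counts the new entry. */
static int count_entry(struct dir *d, const char *name)
{
    (void)name;
    atomic_fetch_add(&d->nentries, 1);
    return 0;
}

/* A directory type that only has the classic interface... */
const struct dir_ops legacy_ops = { .mkdir = count_entry };
/* ...and one that also provides the _shared variant. */
const struct dir_ops shared_ops = {
    .mkdir = count_entry,
    .mkdir_shared = count_entry,
};
```

In this model a directory using `shared_ops` lets many threads run `dir_mkdir()` at once, so the operation itself must provide any finer-grained synchronisation (here, the atomic counter). A directory using `legacy_ops` still serialises every update.
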
Hi Neil, thanks for the patch. One minor nit for the next revision
of the cover letter:
> Another motivation is lustre which
> can use a modified ext4 as the storage backend. One of the current
> modifications is to allow concurrent updates in a directory as lustre
> uses a flat directory structure to store data.
This isn't really correct. Lustre uses a directory tree for the
namespace, but directories might become very large in some cases
with 1M+ cores working in a single directory (hey, I don't write
the applications, I just need to deal with them). The servers will
only have 500-2000 threads working on a single directory, but the
fine-grained locking on the servers is definitely a big win.
Being able to have parallel locking on the client VFS side would
also be a win, given that large nodes commonly have 192 or 256
cores/threads today. We know parallel directory locking will be
a win because mounting the filesystem multiple times on a single
client (which the VFS treats as multiple separate filesystems)
and running a multi-threaded benchmark in each mount in parallel
is considerably faster than running the same number of threads in
a single mountpoint.
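
For anyone wanting to try that kind of comparison, a minimal sketch of a create-heavy test follows. It is not the benchmark referred to above: `run_create_bench()`, the file naming scheme and the `MAX_THREADS` limit are made up for illustration. Each thread creates empty files in one of the given directories, and the function returns the elapsed wall-clock time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define MAX_THREADS 256   /* arbitrary limit for this sketch */

struct worker {
    const char *dir;   /* directory this thread creates files in */
    int id;            /* keeps file names unique across threads */
    int nfiles;
    int failed;
};

static void file_path(char *buf, size_t len, const struct worker *w, int i)
{
    snprintf(buf, len, "%s/bench-%ld-t%d-f%d",
             w->dir, (long)getpid(), w->id, i);
}

static void *create_files(void *arg)
{
    struct worker *w = arg;
    char path[4096];

    for (int i = 0; i < w->nfiles; i++) {
        file_path(path, sizeof(path), w, i);
        int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
        if (fd < 0) {
            w->failed = 1;
            break;
        }
        close(fd);
    }
    return NULL;
}

/*
 * Thread i creates nfiles empty files in dirs[i % ndirs].
 * Returns elapsed seconds (thread start-up included), or -1.0 on error.
 * The files are removed again outside the timed region.
 */
double run_create_bench(const char *const *dirs, int ndirs,
                        int nthreads, int nfiles)
{
    pthread_t tid[MAX_THREADS];
    struct worker w[MAX_THREADS];
    struct timespec t0, t1;
    char path[4096];
    int started = 0, failed = 0;

    assert(dirs != NULL);
    if (ndirs < 1 || nthreads < 1 || nthreads > MAX_THREADS || nfiles < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nthreads; i++) {
        w[i] = (struct worker){ dirs[i % ndirs], i, nfiles, 0 };
        if (pthread_create(&tid[i], NULL, create_files, &w[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        failed |= w[i].failed;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Clean up; unlink errors for files that were never created are ignored. */
    for (int i = 0; i < started; i++) {
        for (int j = 0; j < nfiles; j++) {
            file_path(path, sizeof(path), &w[i], j);
            unlink(path);
        }
    }

    if (failed)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling it once with a single directory and once with the same thread count spread over the matching directory under several mount points of the same filesystem gives the two numbers to compare. This only says something when each mount is a separate client instance, as described above.
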
Cheers, Andreas