Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

From: NeilBrown
Date: Fri May 15 2015 - 19:30:42 EST


On Fri, 15 May 2015 15:15:48 -0600 Andreas Dilger <adilger@xxxxxxxxx> wrote:

> On May 14, 2015, at 5:23 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote:
> >> On Wed, May 13, 2015 at 8:30 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>> Maybe... I'd like to see the profiles, TBH - especially getxattr() and
> >>> access() frequency on various loads. Sure, make(1) and cc(1) really care
> >>> about stat() very much, but I wouldn't be surprised if something like
> >>> httpd or samba would be hitting getxattr() a lot...
> >>
> >> So I haven't seen samba profiles in ages, but iirc we have more
> >> serious problems than trying to speed up basic filename lookup.
> >>
> >> At least long long ago, inode semaphore contention was a big deal,
> >> largely due to readdir().
> >
> > It still is - it's the prime reason people still need to create
> > hashed directory structures so that they can get concurrency in
> > directory operations. IMO, concurrency in directory operations is a
> > more important problem to solve than worrying about readdir speed;
> > in large filesystems readdir and lookup are IO bound operations and
> > so everything serialises on the IO as it's done with the i_mutex
> > held....
>
> We've had a patch[*] to add ext4 parallel directory operations in Lustre for
> a few years, that adds separate locks for each internal tree and leaf block
> instead of using i_mutex, so it scales as the size of the directory grows.
> This definitely improved many-threaded directory create/lookup/unlink
> performance (rename still uses a single lock).

.. and I've been wondering what to do about i_mutex and NFS. I've had
customer reports of slowness in creating files that seems to be due to
i_mutex on the directory being held over the whole 'create' RPC, so only one
create per directory can be in flight at any one time.
"make -j" on a large source directory can easily want to create lots of
"*.o" files at "the same time".

And NFS doesn't need i_mutex at all because the server will provide the
needed guarantees.
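
A rough way to look for this on a given mount is to time a burst of
creates in one directory at different levels of concurrency. The sketch
below is only an illustration: the names create_one, timed_creates and
the TESTDIR environment variable are made up for it, and it assumes a
Python 3 interpreter on the client. If the elapsed time barely drops as
workers are added, that is consistent with creates being serialised per
directory, though it does not by itself show that i_mutex is the cause.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def create_one(path):
    # O_CREAT | O_EXCL forces a real create in the parent directory.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    os.close(fd)


def timed_creates(directory, n_files, n_workers):
    """Create n_files empty "*.o" files in directory using n_workers threads.

    Returns the elapsed wall-clock time in seconds.
    """
    paths = [os.path.join(directory, "obj%05d.o" % i) for i in range(n_files)]
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # list() drains the iterator so a worker's exception is raised here.
        list(pool.map(create_one, paths))
    return time.monotonic() - start


if __name__ == "__main__":
    # Assumption: TESTDIR points at a directory on the NFS mount. The
    # fallback is a local temporary directory, which will not show the
    # NFS behaviour.
    base = os.environ.get("TESTDIR") or tempfile.mkdtemp()
    for workers in (1, 4, 16):
        d = tempfile.mkdtemp(dir=base)
        elapsed = timed_creates(d, 1000, workers)
        print("%2d workers: %.3fs" % (workers, elapsed))
```

Each run uses a fresh subdirectory so that directory size does not skew
the comparison between worker counts.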

NeilBrown
