Re: [PATCH v3 1/3] fs: speed up path lookup with cheaper handling of MAY_EXEC

From: Jan Kara
Date: Mon Nov 10 2025 - 05:18:03 EST


On Mon 10-11-25 10:46:38, Mateusz Guzik wrote:
> On Mon, Nov 10, 2025 at 10:32 AM Jan Kara <jack@xxxxxxx> wrote:
> >
> > On Fri 07-11-25 15:21:47, Mateusz Guzik wrote:
> > > The generic inode_permission() routine does work which is known to be of
> > > no significance for lookup. There are checks for MAY_WRITE, while the
> > > requested permission is MAY_EXEC. Additionally devcgroup_inode_permission()
> > > is called to check for devices, but it is an invariant that the inode is a
> > > directory.
> > >
> > > Absent a ->permission func, execution lands in generic_permission()
> > > which checks upfront if the requested permission is granted for
> > > everyone.
> > >
> > > We can elide the branches which are guaranteed to be false and cut
> > > straight to the check if everyone happens to be allowed MAY_EXEC on the
> > > inode (which holds true most of the time).
> > >
> > > Moreover, filesystems which provide their own ->permission routine can
> > > take advantage of the optimization by setting the IOP_FASTPERM_MAY_EXEC
> > > flag on their inodes, which they can legitimately do if their MAY_EXEC
> > > handling matches generic_permission().
> > >
> > > As a simple benchmark, as part of compilation gcc issues access(2) on
> > > numerous long paths, for example /usr/lib/gcc/x86_64-linux-gnu/12/crtendS.o
> > >
> > > Issuing access(2) on it in a loop on ext4 on Sapphire Rapids (ops/s):
> > > before: 3797556
> > > after: 3987789 (+5%)
> > >
> > > Note: this depends on the not-yet-landed ext4 patch to mark inodes with
> > > cache_no_acl()
> > >
> > > Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
> >
> > The gain is nice. I'm just wondering where exactly it is coming from. I
> > don't see that we'd be saving a memory load or a significant amount of
> > work. So is it really coming from the more compact code and from saving
> > several unlikely branches and function calls?
>
> That's several branches and 2 function calls per path component on the
> way to the terminal inode. In the path at hand, that's 10 function
> calls elided.

OK, path lookup is really light, so I guess 10 function calls are visible
enough. I guess this is a hot enough path that the micro-optimization is worth
the code duplication. So feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Honza

--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR