Re: [PATCH v4] fs: add predicts based on nd->depth

From: Mateusz Guzik

Date: Tue Nov 11 2025 - 13:00:37 EST

On Tue, Nov 11, 2025 at 4:32 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Nov 10, 2025 at 05:59:01PM +0100, Mateusz Guzik wrote:
>
> > Given these results:
> > 1. terminate_walk() is called towards the end of the lookup. I failed
> > run into a case where it has any depth to clean up. For now predict
> > it does not.
>
> Easy - just have an error while resolving a nested symlink in the middle
> of pathname. On success it will have zero nd->depth, of course.
>

Ok, I'll update the commit message.

> > 2. legitimize_links() is also called towards the end of lookup and most
> > of the time there s 0 depth. Patch consumers to avoid calling into it
> > in that case.
>
> On transition from lazy to non-lazy mode on cache miss, ->d_revalidate()
> saying "dunno, try in non-lazy mode", etc.
>
> That can happen inside a nested symlink as well as on the top level, but
> the latter is more common on most of the loads.
>

Given the rest of the e-mail I take it you are clarifying when depth >
0 in this case, as opposed to contesting the predict.

> > 3. walk_component() is typically called with WALK_MORE and zero depth,
> > checked in that order. Check depth first and predict it is 0.
>
> Does it give a measurable effect?

I did not benchmark this patch at all, merely checked for predicts
going the right way with bpftrace. What do I intend to benchmark is
the following: cleanup and inlining of func calls to walk_component()
and step_into(), which happen every time.

The routine got ->depth predicts in this patch because I was already
sprinkling it.

The current asm of walk_component() is penalized by lookup_slow(!)
being inlined and convincing the compiler to spill extra registers.
step_into() also looks like it can be shortened for the common case.

When inlined the compiler should be able to elide branching on WALK_MORE anyway.

You can find profiling info from a kernel running my patches (modulo
this one) here:
https://lore.kernel.org/linux-fsdevel/20251111-brillant-umgegangen-e7c891513bce@brauner/T/#mee42496c2c6495a54c6b0b4da5c4121540ad92d9

walk_component() is hanging out there at over 2.7% and step_into() is
4.8%. I would suspect there is at least 1% total hiding here for
trivial fixups + inlining.

That said, I can drop this bit no problem and add it later to the
patchset which takes care of both walk_component() and step_into()
(which will come with benchmarks).

>
> > 4. link_path_walk() predicts not dealing with a symlink, but the other
> > part of symlink handling fails to make the same predict. Add it.
>
> Unconvincing, that - one is "we have a component; what are the odds of that
> component being a symlink?", another - "we have reached the end of pathname
> or the end of nested symlink; what are the odds of the latter being the case?"
>

Fair, so I added the following crapper:

diff --git a/fs/dcache.c b/fs/dcache.c
index de3e4e9777ea..2bac37c09bf6 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3268,3 +3268,7 @@ void __init vfs_caches_init(void)
bdev_cache_init();
chrdev_init();
}
+
+
+void link_path_walk_probe(int);
+void link_path_walk_probe(int) { }
diff --git a/fs/namei.c b/fs/namei.c
index caeed986108d..00125c578af4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2470,6 +2470,8 @@ static inline const char *hash_name(struct
nameidata *nd, const char *name, unsi
#define LAST_WORD_IS_DOTDOT 0x2e2e
#endif

+void link_path_walk_probe(int);
+
/*
* Name resolution.
* This is the basic name resolution function, turning a pathname into
@@ -2544,6 +2546,7 @@ static int link_path_walk(const char *name,
struct nameidata *nd)
} while (unlikely(*name == '/'));
if (unlikely(!*name)) {
OK:
+ link_path_walk_probe(depth);
/* pathname or trailing symlink, done */
if (likely(!depth)) {
nd->dir_vfsuid =
i_uid_into_vfsuid(idmap, nd->inode);

then probing over kernel build:
bpftrace -e 'kprobe:link_path_walk_probe { @[probe] = lhist(arg0, 0, 8, 1); }'
@[kprobe:link_path_walk_probe]:
[0, 1) 7528231 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2) 407905 |@@ |

I think that's nicely skewed.

> I can believe that answers to both questions are fairly low, but they are
> not the same. I'd expect the latter to be considerably higher than the
> former.
>
> > - if (unlikely(!legitimize_links(nd)))
> > + if (unlikely(nd->depth && !legitimize_links(nd)))
>
> I suspect that
> if (unlikely(nd->depth) && !legitimize_links(nd))
> might be better...
>

Well it says it is unlikely there is depth, but when there is and
legitimize_links is called, it is unlikely it fails. I think this
matches the current intent, except avodiing the func call most of the
time.