Re: [PATCH v3 1/1] dcache: Translating dentry into pathname without taking rename_lock

From: Linus Torvalds
Date: Fri Sep 06 2013 - 20:19:43 EST


On Fri, Sep 6, 2013 at 5:00 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> Er... what will happen if you have done just what you've described and have
> a process call d_lookup()?

Umm. Yes?

What part of "one single path component" did you not get?

To repeat: d_lookup() NEVER LOOKS UP A WHOLE PATHNAME. It looks up
just a single path component. It matters not one whit whether you look
up a filename that is 1500 components deep, d_lookup() only ever works
on one single component. It's a single short hash chain lookup.

Sure, it can fail, but people really have to work at it. You're not
going to be able to make it fail by just calling "rename()" in a loop.
You're going to have to do multiple threads at least, and now you need
to do it on multiple different filesystems, since otherwise those
multiple threads are going to be serialized by the (outer)
per-filesystem vfs-layer rename semaphores. In other words, it sounds
impossible to trigger, wouldn't you say? Or if you try, you're going
to stand out for using a *lot* of resources.

In contrast, doing the getcwd() lookup really is following potentially
quite long chains.

So it's quite possible that just a single thread doing rename() in a
loop (on, say, /tmp, so that there isn't any IO) can trigger the
sequence write-lock frequently enough that traversing 1500 d_parent
entries might never complete.

Have I tried it? No. But think about the two different scenarios.
There really is a *big* difference between looking up one single
dentry from its parent using our hash tables, and traversing a
potentially almost unbounded parenthood chain.
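
As a rough illustration of that difference, here is a toy userspace
model, not kernel code. It assumes a writer that bumps a sequence
counter once every write_period ticks and a reader that spends one tick
per entry visited; the names sim, seq_at and walk_with_retry are made up
for this sketch. The loop mirrors the read_seqbegin()/read_seqretry()
pattern but ignores the odd/even "write in progress" state of a real
seqlock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one write (rename) every write_period ticks. */
struct sim {
    uint64_t now;          /* current tick */
    uint64_t write_period; /* ticks between sequence bumps */
};

/* Sequence value visible at tick t: number of writes completed so far. */
static uint64_t seq_at(const struct sim *s, uint64_t t)
{
    return t / s->write_period;
}

/*
 * Walk chain_len entries with seqlock-style read/retry.
 * Returns the number of attempts it took, or 0 if every one of
 * max_tries attempts was invalidated by a write.
 */
static unsigned walk_with_retry(struct sim *s, uint64_t chain_len,
                                unsigned max_tries)
{
    assert(s->write_period > 0 && chain_len > 0);

    for (unsigned attempt = 1; attempt <= max_tries; attempt++) {
        uint64_t start_seq = seq_at(s, s->now); /* like read_seqbegin() */

        s->now += chain_len;                    /* one tick per entry */

        if (seq_at(s, s->now) == start_seq)     /* like read_seqretry() */
            return attempt;                     /* no write raced with us */
    }
    return 0;                                   /* starved by the writer */
}
```

In this model, with a write every 100 ticks, a one-entry lookup succeeds
on the first or second attempt, while a 1500-entry walk never sees a
quiet window long enough and fails every attempt.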

(We're bounded in practice by PATH_MAX, so you can't make getcwd()
traverse more than about 2000 parents (single character filename plus
the slash for each level), and for all I know filesystems might cap it
before that, so it's not unbounded, but the difference between "1" and
"2000" is pretty damn big)
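
The "about 2000" figure can be checked with a small sketch, assuming
Linux's PATH_MAX of 4096 bytes including the terminating NUL. The helper
name max_single_char_depth is made up for illustration, and any lower
cap a filesystem might impose is not modelled.

```c
#include <stddef.h>

/*
 * Deepest nesting whose absolute path fits in a buffer of path_max
 * bytes (trailing NUL included) when every component is a single
 * character, so each level costs two bytes: '/' plus the name.
 */
static size_t max_single_char_depth(size_t path_max)
{
    if (path_max < 3)            /* need room for "/x" and the NUL */
        return 0;
    return (path_max - 1) / 2;   /* reserve one byte for the NUL */
}
```

For a 4096-byte limit this gives 2047 levels.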

Linus