Re: [git pull] more vfs bits

From: Linus Torvalds
Date: Sat Feb 21 2015 - 20:15:04 EST

On Sat, Feb 21, 2015 at 4:18 PM, David Howells <dhowells@xxxxxxxxxx> wrote:
> So what I want to do is:
> (1) Introduce wrappers around accesses to ->d_inode. These can be used to
> mark such accesses as being classified into two categories:
> (A) Some accesses should only consider the dentry they're given.
> Filesystems would fall into this category when dealing with their own
> dentries and inodes.

.. and this is the one that makes no sense to me.

It's the common case, and I don't see how it *possibly* adds any
value. The "I want the inode of this dentry" is traditionally done as

What is the *upside* of the wrapper?

> (B) Other accesses should fall through from the dentry they're given to
> an underlying layer. A lot of non-filesystem accesses, including
> those in the VFS side of pathwalk, fall into this category, as do
> accesses to external objects made by filesystems such as overlayfs
> and ecryptfs.

Again, if they actually want something *else* than the "native inode",
then at that point a wrapper makes sense.

If you actually want to document that "this use wants the underlying
inode", then go wild. But that is *different* from all the common uses
inside random filesystems that just want "what's the inode of this

IOW, my argument is that I cannot see *any* possible reason to wrap
the normal "give me the inode associated with this dentry".

I *can* see a reason to wrap - and document - the cases that are *not*
that kind of simple thing. I *can* see the reason for saying "give me
the inode of the lower layer", or "give me the inode, or NULL if it's
a whiteout entry".

But it's just that empty "fs_inode()" wrapper itself that I just don't
see the point of.

Every single low-level filesystem that does all the actual core
lookup/mkdir/create/whatever operations care about the native inode.
You seem to even kind of acknowledge that in the naming: "fs_inode()".
And my argument is that there is never any possible reason why that
would ever be anything but "dentry->d_inode".

So the wrapper doesn't actually help anything, and it *does* obfuscate
things. It makes people go "whats' the difference between "fs_inode()"
and "fs_inode_once()". It quite possible makes people think it's an
expensive operation. Who knows. It seems pointless.

> (6) Add a DCACHE_WHITEOUT_TYPE type variant for dentries that will in the
> future be recognised by the VFS as a 'negative' type, but can be used by
> the overlay or unionmount to note a difference to an ordinary negative
> type dentry that can also be a fallthrough (at least in unionmount if
> this ever makes it).

Right. And a helper/wrapper makes sense for those cases, and actually
clarifies that "this particular code knows and cares about whiteout

I'm not arguing against those kinds of wrappers that actually
_clarify_ things. I'm arguing against random wrappers that don't, and
that really fundamentally seem like they cannot possibly ever have any
other valid value.

A low-level filesystem may have a real inode associated with a
whiteout dentry, it could be a device inode with a zero major number
or some other special inode (or it might not be an inode at alln - it
could easily also be just a directory entry type). Such a filesystem
would clearly care about the inode. and would actually really care
about "dentry->d_inode".

And I actually think that having "dentry->d_inode" is *clearer* than
"fs_ionode(dentry)". It's clear that that is a naked native access.
It's not some hidden abstracted case. That's *the* inode associated
with the dentry.

> (7) Where feasible, replace if-statements that check the positivity or
> negativity of dentries by doing if (...->d_inode) with checks on the type
> of the dentry.

.. and this is again the kind of wrapper I think is *good*. It's
abstracting some real issue.

I don't object at all to abstracting out "ok, a dentry is negative if
it has a NULL inode, or if it is marked as a whiteout entry". I think
that writing

if (d_negative(dentry)) ...

is more readable than

if (!dentry->d_inode || (dentry->d_flags & DCACHE_WHITEOUT)) ..

or variations of that (I guess it's "IS_WHITEOUT(dentry->d_inode)" right now).

So I'm not against helpers that do something meaningful.

And there are downsides to arbitrary random wrappers/helpers. Churn.
Harder to see what the code actually *does*. Bad naming (dentry
operations are generally called "d_xyz()" or "dentry_xyz()"). Does it
do soemthign else? Are there rules for calling this? All the mental
rules you have to have, and that *change* just because you change the

>> Now, that was true in the "bad old days" when we just used
>> ACCESS_ONCE(dentry->d_inode) too, but at least in that old model we
>> don't have some idiotic wrapper around it.
> I can't make ACCESS_ONCE(fs_inode(dentry)) work if fs_inode() is an inline
> function. I might be able to make it work if it's a macro. I also don't want
> to call ACCESS_ONCE() inside fs_inode().

Oh, I see *why* you did it, given that you wanted a wrapper.

But I also see this very much as an example of "the wrapper is
actually causing more problems than it's solving". The fact that the
first wrapper exists now means that you have to use *another* wrapper.

And dammit, it's not *helping*.

Now, if it was something like "ok, the revalidate() call is very very
special for a low-level filesystem, in that the filesystem cannot
trust the inode pointer, so we have a special
"d_revalidate_inode(dentry) wrapper that does ACCESS_ONCE()", then
such a wrapper would actually be *documentation*, and would kind of
show something real ("revalidate is special").

See what I'm trying to say? At that point - even if the wrapper
doesn't *do* anything - it at least documents some rule, and migth be
worth it for that reason.

But what does "fs_inode_once()" document? Nothing. It's just an
artifact of you having introduced a wrapper and done a
search-and-replace. So now you have the extra abstraction of a
wrapper, with all the disadvantages of abstractions, and none of the
advantages that abstraction is supposed to actually bring us.

> Suggest a better name that's not too long then please. fs_inode(d) is about
> the same length as d->d_inode which means I can just script it and I don't
> have to go in and reformat the code.

So I didn't check, but I *think* the only valid use for the
ACCESS_ONCE() on ->d_inode is for revalidate, which is kind of special
in that we call it without locks held. So *if* that is true, then
maybe "d_revalidate_inode()" would be a reasonable name. But again, I
didn't check that semantic rule, so maybe it doesn't work.

And yes, I'm annoyed, because I really hate the timing of this pull
request. So it's not just about the wrapper, it's about when/how this
all reached me.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at