Re: [GIT PULL 05/12 for v7.1] vfs fs_struct
From: Linus Torvalds
Date: Mon Apr 13 2026 - 15:18:47 EST
On Fri, 10 Apr 2026 at 08:18, Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> Avoid excessive dput/dget in audit_context setup and reset paths
I really don't like this one, and I'm going to at least delay pulling it.
Because this really feels wrong to me. It feels like a hack. Just
looking at some of the paths that get modified, we have that
copy_fs_struct, which does
path_get(&fs->root);
to increment the root, and then the very next line ends up being
get_fs_pwd_pool_locked(old, &fs->pwd);
that copies the pwd very differently.
And the only reason 'pwd' is different is because the audit code
treats pwd and root differently.
And that, in turn, is because the audit code is just historical and
broken, and can't deal with chroot environments.
The pwd and root paths are two sides of the same coin, and any code
that treats them this differently is *wrong*.
So this basically takes a audit mistake, and makes that mistake
visible to non-audit users.
I absolutely hate it.
I'm going to think about this more, and maybe I'll end up saying "it's
still ugly, but whatever, I'll pull it anyway".
But right now I'm going "there HAS to be a better way".
And maybe you can convince me this hackery is the right thing, but it
really is rubbing me the wrong way entirely.
The hackery is evident everywhere. Like that
/* Release the extra pwd references of new_fs, if present. */
while (new_fs && new_fs->pwd_refs) {
path_put(&new_fs->pwd);
new_fs->pwd_refs--;
}
because it's obviously too painful to switch everything over to having
a "decrement by more than one", and this codepath is simply not
*worth* the pain, because it's just for a stupid audit special case
that is largely historical.
So I see why this all is done, but at the same time it realy strikes
me as being very very ugly.
No, I don't have a solution.
My gut feel is that
(a) audit is wrong to begin with to think that only 'pwd' is special
and that root is always printed as NULL
(b) audit should not be taking a ref 'struct path' in the first
place, but to 'struct fs_struct'.
Again, treating 'pwd' differently from root is a fundamental mistake.
Always was. The only difference between the two is whether you had '/'
at the beginning of a path or not (and the trivial 'different system
calls to set them').
But as long as it was a mistake just inside the audit path, who cares?
But with this pull it's just broken garbage escaping outside those confines.
So for example, I'd be a *lot* happier with some kind of "borrow the
whole 'struct fs_struct'" model, where audit gets a new "borrow
reference" to the whole struct, and then the places that actually
change 'struct fs_struct' would create a copy of it and replace any
pending audit users with that copy.
There are exactly three such places that change it, all right next to
each other: set_fs_root(), set_fs_pwd() and chroot_fs_refs().
And we already have the lock that protects all this (fs->seq), and we
could add a new 'struct audit_borrow_struct" that audit uses, which
has a pointer to the 'struct fs_struct',
And then audit would borrow the whole thing, we'd keep a list of
current borrowers and unborrow it at the end, and absolutely *NOTHING*
would change in the core path handling.
The couple of places that change 'struct fs_struct' could then just
look at the list of borrowers synchronize them (create a new fake
'struct fs_struct' just for them, and maybe increment refs at *that*
point).
And it would actually *improve* on the audit code, because now audit
would at least *have* the 'root' path too, even though it obviously
doesn't use it and always thinks root is a global root thing.
And no, nobody would ever bother to fix audit, but at least
_conceptually_ it would set the infrastructure for the audit paths
getting this right.
Yes, yes, the above is very very handwavy and I didn't write the code,
and possibly the rantings of an insane person.
I didn't think a *lot* about this. I just looked at this patch series
and threw up in my mouth a little, and tried to think of "maybe we
could do it differently" and the above is a rough draft of something
that might work.
Please? Humor me and think about it, at least.
Linus