Re: [PATCH] vfs: check dentry is still valid in get_link()
From: Christian Brauner
Date: Wed Jan 19 2022 - 04:05:55 EST
On Tue, Jan 18, 2022 at 04:04:50PM +0000, Al Viro wrote:
> On Tue, Jan 18, 2022 at 09:29:11AM +0100, Christian Brauner wrote:
> > On Mon, Jan 17, 2022 at 06:10:36PM +0000, Al Viro wrote:
> > > On Mon, Jan 17, 2022 at 04:28:52PM +0000, Al Viro wrote:
> > >
> > > > IOW, ->free_inode() is RCU-delayed part of ->destroy_inode(). If both
> > > > are present, ->destroy_inode() will be called synchronously, followed
> > > > by ->free_inode() from RCU callback, so you can have both - moving just
> > > > the "finally mark for reuse" part into ->free_inode() would be OK.
> > > > Any blocking stuff (if any) can be left in ->destroy_inode()...
> > >
> > > BTW, we *do* have a problem with ext4 fast symlinks. Pathwalk assumes that
> > > strings it parses are not changing under it. There are rather delicate
> > > dances in dcache lookups re possibility of ->d_name contents changing under
> > > it, but the search key is assumed to be stable.
> > >
> > > What's more, there's a correctness issue even if we do not oops. Currently
> > > we do not recheck ->d_seq of symlink dentry when we dismiss the symlink from
> > > the stack. After all, we'd just finished traversing what used to be the
> > > contents of a symlink that used to be in the right place. It might have been
> > > unlinked while we'd been traversing it, but that's not a correctness issue.
> > >
> > > But that critically depends upon the contents not getting mangled. If it
> > > *can* be screwed by such unlink, we risk successful lookup leading to the
> >
> > Out of curiosity: whether or not it can get mangled depends on the
> > filesystem and how it implements fast symlinks or do fast symlinks
> > currently guarantee that contents are mangled?
>
> Not sure if I understand your question correctly...
>
> Filesystems should guarantee that the contents of string returned
> by ->get_link() (or pointed to by ->i_link) remains unchanged for as long
> as we are looking at it (until fs/namei.c:put_link() that drops it or
> fs/namei.c:drop_links() in the end of pathwalk). Fast symlinks or not -
> doesn't matter.
Yep, got that.
>
> The only cases where that does not hold (there are two of them in
> the entire kernel) happen to be fast symlinks. Both cases are bugs.
Ok, that's what I was essentially after whether or not they were bugs in
the filesystems or it's a generic bug.
> ext4 case is actually easy to fix - the only reason it ends up mangling
> the string is the way ext4_truncate() implements its check for victim
> being a fast symlink (and thus needing no work). It gets disrupted
> by zeroing ->i_size, which we need to do in this case (inode removal).
> That's not hard to get right.
Oh, I see, it zeroes i_size and erases i_data which obviously tramples
the fast symlink contents.
Given that ext4 makes use of i_flags for their ext4 inode containers why
couldn't this just be sm like
#define EXT4_FAST_SYMLINK 0x<some-free-value>
if (EXT4_I(inode)->i_flags & EXT4_FAST_SYMLINK)
return <im-a-fast-symlink>;
? Which seems simpler and more obvious to someone reading that code than
logic based on substracting blocks or checking i_size.