Re: [PATCH] bfs: put a inode if link count is 0
From: Lizhi Xu
Date: Thu Jan 09 2025 - 02:40:10 EST
On Thu, 9 Jan 2025 07:32:28 +0000, Al Viro wrote:
> > On Thu, 9 Jan 2025 06:22:16 +0000, Al Viro wrote:
> > > > The reproducer renames the file twice in succession, each time to the
> > > > same file name. After the first rename, the link count of the inode is
> > > > set to 0. The second rename uses the same inode, which triggers the
> > > > warning about i_nlink already being 0.
> > > >
> > > > To avoid this issue, put the target inode before exiting bfs_rename().
> > >
> > > This is completely insane - you get an extra drop of in-core inode
> > > refcount, which *will* end up with dangling pointer and memory corruption.
> > > Besides, there is a perfectly legitimate case when you open a file and
> > > rename something on top of it. It MUST remain open and alive until the
> > > last in-core reference to inode goes away, which must not happen before
> > > close().
> > In the reproducer, the file is renamed to its own name, "file0". file0 is
> > created with mknod, which allocates its inode and sets i_nlink to 1.
> > The reproducer never opens file0. Does what you said also apply in this
> > situation?
>
> I'm not sure I understand your sentence, to be honest.
>
> Your patch is 100% wrong - you must *not*, under any circumstances, have
> ->rename() drop references to in-core struct inode instances. It's *always*
> wrong; the reference to new_inode in new_dentry->d_inode remains there (as
> it ought to) and its contribution to new_inode refcount remains unchanged.
> It has nothing to do with ->i_nlink; you are decrementing ->i_count, which
> controls the lifetime of in-core struct inode instance. As soon as that
> reaches zero, struct inode instance will be freed. And destructor of
> new_dentry *will* call iput() on its ->d_inode, so you'll end up with
> attempt to decrement refcount of already freed memory object.
>
> Again, that has nothing to do with ->i_nlink. I'm not familiar with
> bfs layout, so I can't tell what's going on with the corrupted image syzbot
> is messing with just by visual examination. Is there, by any chance,
> a preexisting file0 in the root of that bfs image? Does mknod() succeed
> there? Because if it doesn't and what you have is a buggered image with
> 'file0' being there having zero link count, yes, renaming over it would
> trigger warnings about detected corrupted filesystem - we are trying to
> remove a link (on disk) to something that claims (on disk) to have 0 links
> at the time of that operation; something is clearly wrong and deserves
> a warning.
>
> Again, that iput() in there is basically introducing random memory
> corruption; it might make the warning go away, but it's not a fix.
I also have deep doubts about using iput() in ->rename().
Thanks for your analysis.
Lizhi