Re: [PATCH] bfs: put a inode if link count is 0

From: Al Viro
Date: Thu Jan 09 2025 - 02:32:55 EST


On Thu, Jan 09, 2025 at 02:39:39PM +0800, Lizhi Xu wrote:
> On Thu, 9 Jan 2025 06:22:16 +0000, Al Viro wrote:
> > > The reproducer performs the rename operation on the file twice in succession
> > > and changes the file to the same file name. After the first rename operation,
> > > the number of links in the inode is set to 0. In the second execution, the
> > > same inode is used, resulting in a 0 value warning for i_nlink.
> > >
> > > To avoid this issue, put the target inode before exiting the bfs_rename.
> >
> > This is completely insane - you get an extra drop of in-core inode
> > refcount, which *will* end up with dangling pointer and memory corruption.
> > Besides, there is a perfectly legitimate case when you open a file and
> > rename something on top of it. It MUST remain open and alive until the
> > last in-core reference to inode goes away, which must not happen before
> > close().
> In the reproducer, changes the file to the same file, the same file name is
> "file0", file0 uses mknod to create its inode and sets the i_nlink value to 1.
> There is no operation to open file0 in the reproducer. Is this situation also
> as you said?

I'm not sure I understand your sentence, to be honest.

Your patch is 100% wrong - you must *not*, under any circumstances, have
->rename() drop references to in-core struct inode instances. It's *always*
wrong; the reference to new_inode in new_dentry->d_inode remains there (as
it ought to) and its contribution to new_inode refcount remains unchanged.
It has nothing to do with ->i_nlink; you are decrementing ->i_count, which
controls the lifetime of the in-core struct inode instance. As soon as that
reaches zero, the struct inode instance will be freed. And the destructor of
new_dentry *will* call iput() on its ->d_inode, so you'll end up with an
attempt to decrement the refcount of an already freed memory object.
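
To make the distinction concrete, here is a minimal userspace sketch of the
two counters. It is a toy model, not kernel code: every toy_* name is
hypothetical, and the "freed" flag merely stands in for the in-core object
being released once the modelled ->i_count reaches zero. It assumes the only
reference in play is the one held by new_dentry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; NOT kernel code. All toy_* names are hypothetical. */
struct toy_inode {
    int i_count;          /* in-core references (models ->i_count) */
    unsigned int i_nlink; /* on-disk link count (models ->i_nlink) */
    bool freed;           /* set when the in-core object would be freed */
};

struct toy_dentry {
    struct toy_inode *d_inode; /* holds one reference to the inode */
};

/*
 * Models iput(): drop one in-core reference.  Returns false if the object
 * had already been freed, i.e. the real kernel would be touching freed
 * memory at this point.
 */
static bool toy_iput(struct toy_inode *inode)
{
    assert(inode != NULL);
    if (inode->freed)
        return false;
    if (--inode->i_count == 0)
        inode->freed = true;
    return true;
}

/* What a rename may do to the overwritten target: drop a link, nothing else. */
static void toy_rename_over(struct toy_inode *target)
{
    if (target->i_nlink > 0)
        target->i_nlink--;
}

/* Models the proposed change: also drop a reference the caller does not own. */
static void toy_rename_over_with_iput(struct toy_inode *target)
{
    toy_rename_over(target);
    toy_iput(target);
}

/* Models the dentry destructor dropping the dentry's own reference. */
static bool toy_dentry_release(struct toy_dentry *dentry)
{
    return toy_iput(dentry->d_inode);
}

/* Returns true if teardown was clean, false if it hit an already freed inode. */
static bool toy_scenario(bool extra_iput)
{
    struct toy_inode inode = { .i_count = 1, .i_nlink = 1, .freed = false };
    struct toy_dentry new_dentry = { .d_inode = &inode };

    if (extra_iput)
        toy_rename_over_with_iput(&inode);
    else
        toy_rename_over(&inode);

    return toy_dentry_release(&new_dentry);
}
```

In this model toy_scenario(false) tears down cleanly even though the link
count has dropped to zero, while toy_scenario(true) reports that the dentry
teardown reached an inode that was already gone.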

Again, that has nothing to do with ->i_nlink. I'm not familiar with the
bfs layout, so I can't tell just by visual examination what's going on with
the corrupted image syzbot is messing with. Is there, by any chance,
a preexisting file0 in the root of that bfs image? Does mknod() succeed
there? Because if it doesn't, and what you have is a buggered image with
'file0' already there with a zero link count, then yes, renaming over it would
trigger warnings about detected corrupted filesystem - we are trying to
remove a link (on disk) to something that claims (on disk) to have 0 links
at the time of that operation; something is clearly wrong and deserves
a warning.

Again, that iput() in there is basically introducing random memory
corruption; it might make the warning go away, but it's not a fix.