Re: [PATCH 5.15 000/251] 5.15.47-rc2 review

From: Jan Kara
Date: Wed Jun 15 2022 - 06:04:46 EST


On Tue 14-06-22 11:51:35, Linus Torvalds wrote:
> On Tue, Jun 14, 2022 at 11:20 AM Thomas Backlund <tmb@xxxxxx> wrote:
> >
> > I "think" this is the suggested fix:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git/commit/?h=for_next&id=46b6418e26c7c26f98ff9c2c2310bce5ae2aa4dd
>
> Ugh, this is just too ugly for words.
>
> That's not a fix. That's a "hide the problem" patch.

I agree it is papering over the real problem. I consider that patch a
stopgap solution so that machines can boot until we find a cleaner one.

> Now, admittedly clearly the "hide the problem" code already existed,
> and was just moved earlier, but I really think this whole "we're
> calling __mark_inode_dirty() on an inode that isn't even *initialized*
> yet" is a much deeper issue, and shouldn't have some hacky work-around
> in __mark_inode_dirty() that just happens to make it work.
>
> I don't mind that patch per se - moving the code is fine.
>
> But I *do* mind the patch when the reason is to hide that wrong
> ordering of operations.
>
> Now, maybe a proper fix might be to say that new_inode_pseudo() should
> always initialize i_state to I_DIRTY_ALL or something like that. The
> comment already says that they cannot participate in writeback, so
> maybe they should be disabled that way (ie a pseudo inode is always
> dirty and marking it dirty does nothing).

Sadly it is not so simple. Firstly, new_inode_pseudo() gets used for all
inodes (through new_inode()). Secondly, tmpfs allocates fully standard
inodes through new_inode() like any other filesystem. We could check the
writeback capabilities of sb->s_bdi in new_inode_pseudo(), but that
would not work for inodes that will become block device inodes, because
blockdev_superblock has noop_backing_dev_info, so we'd have to special-case
that. Overall it looks a bit hairy for my taste.
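
To make the special-casing concrete, here is a rough userspace toy model of
the check described above. It is not kernel code: every toy_* name, the
can_writeback flag and the is_blockdev_sb flag are made up for illustration,
and the real structures look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the kernel definitions. */
#define TOY_I_DIRTY_ALL 0x7u

struct toy_bdi {
	bool can_writeback;	/* hypothetical "bdi supports writeback" flag */
};

struct toy_sb {
	struct toy_bdi *s_bdi;
	bool is_blockdev_sb;	/* hypothetical marker for the blockdev sb */
};

/* Models a bdi that can never do writeback. */
struct toy_bdi toy_noop_bdi = { .can_writeback = false };

/*
 * Pick the initial inode state at allocation time: inodes on a
 * superblock whose bdi cannot do writeback start out "always dirty",
 * so marking them dirty later is a no-op.
 */
unsigned int toy_initial_i_state(const struct toy_sb *sb)
{
	assert(sb != NULL && sb->s_bdi != NULL);

	/*
	 * The special case: the blockdev superblock also points at the
	 * noop bdi, but its inodes may become real block device inodes
	 * that do need writeback, so they must start clean.
	 */
	if (sb->is_blockdev_sb)
		return 0;

	if (!sb->s_bdi->can_writeback)
		return TOY_I_DIRTY_ALL;

	return 0;
}
```

Even in the toy the allocation path has to know about one particular
superblock, which is the hairy part.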

> And then you get rid of the noop_backing_dev_info entirely.

And this would be even more difficult because there are other places that
expect there's *some* bdi associated with each sb.

> Or just make sure that noop_backing_dev_info is fully initialized
> before it's used.
>
> Because I think the real problem here is that things have a pointer to
> an uninitialized backing_dev_info.

I fully agree with this. IMHO we need to initialize noop_backing_dev_info
earlier, but early init is not exactly my comfort zone, so I have to verify
whether the various things done in cgwb_bdi_init() are safe to call so early...
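
As a sketch of the ordering issue only, a userspace toy model follows. Again
this is not kernel code: the toy_* names and the "initialized" flag are
invented, and the real failure is a use of uninitialized bdi state rather
than a clean error return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the kernel definitions. */
struct toy_bdi {
	bool initialized;		/* set once toy_bdi_init() has run */
	unsigned long nr_dirtied;	/* state that needs init before use */
};

struct toy_inode {
	struct toy_bdi *bdi;
	bool dirty;
};

/* Statically allocated, so it exists (zeroed) before anyone inits it. */
struct toy_bdi toy_noop_bdi;

void toy_bdi_init(struct toy_bdi *bdi)
{
	bdi->nr_dirtied = 0;
	bdi->initialized = true;
}

/*
 * Returns 0 on success, -1 if the inode points at a bdi that has not
 * been initialized yet, i.e. the wrong ordering of operations.
 */
int toy_mark_inode_dirty(struct toy_inode *inode)
{
	assert(inode != NULL && inode->bdi != NULL);

	if (!inode->bdi->initialized)
		return -1;

	if (!inode->dirty) {
		inode->dirty = true;
		inode->bdi->nr_dirtied++;
	}
	return 0;
}
```

In the toy, running toy_bdi_init() before the first inode can be dirtied
makes the -1 path unreachable, and that ordering is what needs to be
verified for the real init path.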

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR