Re: [PATCH v2] ext4: fix fast commit inode enqueueing during a full journal commit

From: Jan Kara
Date: Tue May 28 2024 - 06:36:18 EST


On Mon 27-05-24 16:48:24, Luis Henriques wrote:
> On Mon 27 May 2024 09:29:40 AM +01, Luis Henriques wrote:
> >>> + /*
> >>> + * Used to flag an inode as part of the next fast commit; will be
> >>> + * reset during fast commit clean-up
> >>> + */
> >>> + tid_t i_fc_next;
> >>> +
> >>
> >> Do we really need new tid in the inode? I'd be kind of hoping we could use
> >> EXT4_I(inode)->i_sync_tid for this - I can see we even already set it in
> >> ext4_fc_track_template() and used for similar comparisons in fast commit
> >> code.
> >
> > Ah, true. It looks like it could be used indeed. We'll still need a flag
> > here, but a simple bool should be enough for that.
>
> After looking again at the code, I'm not 100% sure that this is actually
> doable. For example, if I replace the above by
>
> bool i_fc_next;
>
> and set it to 'true' below:
>
> >>> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> >>> index 87c009e0c59a..bfdf249f0783 100644
> >>> --- a/fs/ext4/fast_commit.c
> >>> +++ b/fs/ext4/fast_commit.c
> >>> @@ -402,6 +402,8 @@ static int ext4_fc_track_template(
> >>> sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) ?
> >>> &sbi->s_fc_q[FC_Q_STAGING] :
> >>> &sbi->s_fc_q[FC_Q_MAIN]);
> >>> + else
> >>> + ei->i_fc_next = tid;
>
> ei->i_fc_next = true;
>
> Then, when we get to the ext4_fc_cleanup(), the value of iter->i_sync_tid
> may have changed in the meantime from, e.g., ext4_do_update_inode() or
> __ext4_iget(). This would cause the clean-up code to be bogus if it still
> implements the logic below, by comparing the tid with i_sync_tid.
> (Although, to be honest, I couldn't see any visible effect in the quick
> testing I've done.) Or am I missing something, and this is *exactly* the
> behaviour you'd expect?

Yes, this is the behavior I'd expect. The rationale is that if i_sync_tid
points to the running transaction, the inode was modified in that
transaction, which means fast commit needs to write it out. In fact, the
ext4_update_inode_fsync_trans() calls usually happen together with the
ext4_fc_track_...() calls. This could use some cleanup so that we don't set
i_sync_tid in two places unnecessarily, but that's for some other time...
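
To make the rule concrete, here is a rough userspace sketch of the
comparison, not actual kernel code. The struct and the sketch_* names are
hypothetical stand-ins; only the wraparound-safe comparison is modelled on
jbd2's tid_gt(). It assumes i_sync_tid holds the tid of the last transaction
that modified the inode:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
typedef unsigned int tid_t;

struct sketch_inode {
	/* Assumed: tid of the last transaction that modified this inode. */
	tid_t i_sync_tid;
};

/* Wraparound-safe "x is newer than y", modelled on jbd2's tid_gt(). */
static bool sketch_tid_gt(tid_t x, tid_t y)
{
	int difference = (int)(x - y);

	return difference > 0;
}

/*
 * The inode was modified in the running transaction if i_sync_tid
 * points at it, so a fast commit has to write the inode out.
 */
static bool sketch_modified_in_running(const struct sketch_inode *inode,
				       tid_t running_tid)
{
	return inode->i_sync_tid == running_tid;
}

/*
 * Clean-up side of the same idea: after committing 'committed_tid',
 * an inode whose i_sync_tid is newer was modified again in the
 * meantime and must stay queued for the next fast commit.
 */
static bool sketch_keep_after_commit(const struct sketch_inode *inode,
				     tid_t committed_tid)
{
	return sketch_tid_gt(inode->i_sync_tid, committed_tid);
}
```

With this model, i_sync_tid moving forward between tracking and clean-up is
not a problem: it only means the inode was touched by a newer transaction
and so is kept for the next fast commit.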

Honza

--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR