Re: [PATCH v2 2/3] ext4: avoid journaling sb update on error if journal is destroying
From: Ritesh Harjani (IBM)
Date: Sat Mar 08 2025 - 05:01:04 EST
Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx> writes:
> Presently we always BUG_ON if trying to start a transaction on a journal marked
> with JBD2_UNMOUNT, since this should never happen. However, while ltp running
> stress tests, it was observed that in case of some error handling paths, it is
> possible for update_super_work to start a transaction after the journal is
> destroyed eg:
>
> (umount)
> ext4_kill_sb
> kill_block_super
> generic_shutdown_super
> sync_filesystem /* commits all txns */
> evict_inodes
> /* might start a new txn */
> ext4_put_super
> flush_work(&sbi->s_sb_upd_work) /* flush the workqueue */
> jbd2_journal_destroy
> journal_kill_thread
> journal->j_flags |= JBD2_UNMOUNT;
> jbd2_journal_commit_transaction
> jbd2_journal_get_descriptor_buffer
> jbd2_journal_bmap
> ext4_journal_bmap
> ext4_map_blocks
> ...
> ext4_inode_error
> ext4_handle_error
> schedule_work(&sbi->s_sb_upd_work)
>
> /* work queue kicks in */
> update_super_work
> jbd2_journal_start
> start_this_handle
> BUG_ON(journal->j_flags &
> JBD2_UNMOUNT)
>
> Hence, introduce a new sbi flag s_journal_destroying to indicate journal is
> destroying only do a journaled (and deferred) update of sb if this flag is not
> set. Otherwise, just fallback to an un-journaled commit.
>
> We set sbi->s_journal_destroying = true only after all the FS updates are done
> during ext4_put_super() (except a running transaction that will get commited
> during jbd2_journal_destroy()). After this point, it is safe to commit the sb
> outside the journal as it won't race with a journaled update (refer
> 2d01ddc86606).
>
> Also, we don't need a similar check in ext4_grp_locked_error since it is only
> called from mballoc and AFAICT it would be always valid to schedule work here.
>
> Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available")
> Reported-by: Mahesh Kumar <maheshkumar657g@xxxxxxxxx>
> Suggested-by: Jan Kara <jack@xxxxxxx>
> Signed-off-by: Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx>
> ---
> fs/ext4/ext4.h | 2 ++
> fs/ext4/ext4_jbd2.h | 8 ++++++++
> fs/ext4/super.c | 4 +++-
> 3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 2b7d781bfcad..d48e93bd5690 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1728,6 +1728,8 @@ struct ext4_sb_info {
> */
> struct work_struct s_sb_upd_work;
>
> + bool s_journal_destorying;
> +
> /* Atomic write unit values in bytes */
> unsigned int s_awu_min;
> unsigned int s_awu_max;
> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index 9b3c9df02a39..6bd3ca84410d 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> @@ -437,6 +437,14 @@ static inline int ext4_journal_destroy(struct ext4_sb_info *sbi, journal_t *jour
> {
> int err = 0;
>
> + /*
> + * At this point all pending FS updates should be done except a possible
> + * running transaction (which will commit in jbd2_journal_destroy). It
> + * is now safe for any new errors to directly commit superblock rather
> + * than going via journal.
> + */
> + sbi->s_journal_destorying = true;
This is not correct right. I think what we decided to set this flag
before we flush the workqueue. So that we don't schedule any new
work after this flag has been set. At least that is what I understood.
[1]: https://lore.kernel.org/all/87eczc6rlt.fsf@xxxxxxxxx/
-ritesh
> +
> err = jbd2_journal_destroy(journal);
> sbi->s_journal = NULL;
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 8ad664d47806..31552cf0519a 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -706,7 +706,7 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
> * constraints, it may not be safe to do it right here so we
> * defer superblock flushing to a workqueue.
> */
> - if (continue_fs && journal)
> + if (continue_fs && journal && !EXT4_SB(sb)->s_journal_destorying)
> schedule_work(&EXT4_SB(sb)->s_sb_upd_work);
> else
> ext4_commit_super(sb);
> @@ -5311,6 +5311,8 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
> spin_lock_init(&sbi->s_error_lock);
> INIT_WORK(&sbi->s_sb_upd_work, update_super_work);
>
> + sbi->s_journal_destorying = false;
> +
> err = ext4_group_desc_init(sb, es, logical_sb_block, &first_not_zeroed);
> if (err)
> goto failed_mount3;
> --
> 2.48.1