Re: [PATCH v1] ext4: fix use-after-free in update_super_work when racing with umount
From: Jan Kara
Date: Wed Mar 18 2026 - 13:10:01 EST
On Wed 18-03-26 13:04:03, Jiayuan Chen wrote:
>
> On 3/17/26 7:38 PM, Jan Kara wrote:
> > On Fri 13-03-26 14:52:04, Jiayuan Chen wrote:
> > > Commit b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
> > > moved ext4_unregister_sysfs() before flushing s_sb_upd_work to prevent new
> > > error work from being queued via /proc/fs/ext4/xx/mb_groups reads during
> > > unmount. However, this introduced a use-after-free because
> > > update_super_work calls ext4_notify_error_sysfs() -> sysfs_notify() which
> > > accesses the kobject's kernfs_node after it has been freed:
> > >
> > > update_super_work ext4_put_super
> > > ----------------- --------------
> > > ext4_unregister_sysfs(sb)
> > > kobject_del(&sbi->s_kobj)
> > > __kobject_del()
> > > sysfs_remove_dir()
> > > kobj->sd = NULL
> > > sysfs_put(sd)
> > > kernfs_put() // RCU free
> > > ext4_notify_error_sysfs(sbi)
> > > sysfs_notify(&sbi->s_kobj)
> > > kn = kobj->sd // stale pointer
> > > kernfs_get(kn) // UAF on freed kernfs_node
> > > ext4_journal_destroy()
> > > flush_work(&sbi->s_sb_upd_work)
> > >
> > > The original blamed commit needed procfs removal before the work
> > > flush to prevent /proc/fs/ext4/xx/mb_groups reads from queuing new error
> > > work. But it bundled procfs removal and kobject_del together in
> > > ext4_unregister_sysfs(), causing the sysfs kobject to be torn down too
> > > early.
> > >
> > > The correct teardown ordering has three constraints:
> > >
> > > 1. procfs removal must happen before flushing s_sb_upd_work, to prevent
> > > /proc reads from queuing new error work that would BUG_ON.
> > > 2. sysfs kobject removal must happen after flushing s_sb_upd_work, since
> > > the work calls sysfs_notify() which accesses the kernfs_node.
> > > 3. sysfs kobject removal must happen before jbd2_journal_destroy(), since
> > > userspace could read the journal_task sysfs attribute and dereference
> > > j_task after the journal thread has been killed.
> > >
> > > Fix this by:
> > > - Adding ext4_sb_release_proc() to remove procfs entries early.
> > > - Splitting ext4_journal_destroy() into ext4_journal_stop_work() and
> > > ext4_journal_finish(), so that ext4_unregister_sysfs() can be placed
> > > between them to satisfy all three ordering constraints.
> > >
> > > Fixes: b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
> > > Cc: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
> > > Signed-off-by: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
> > Thanks for the analysis and the patch! I fully agree with your analysis but
> > I think your solution shows that the teardown sequence is just too subtle
> > (too many different dependencies). Ideally, we'd instead modify
> > ext4_notify_error_sysfs() to detect sysfs is already torn down by
> > ext4_unregister_sysfs() and do nothing. We can check
> > sbi->s_kobj.state_in_sysfs for that. The only trouble with this is that
> > sysfs_notify() could still race with kobject_del() so we also need some
> > locking in ext4_unregister_sysfs() locking out parallel
> > ext4_notify_error_sysfs() and we probably need to introduce a separate
> > mutex for that. What do you think?
> >
> > Honza
> >
>
>
> Hi Honza,
>
> Thanks for the suggestion !
>
> I tried the approach you described — having ext4_notify_error_sysfs() check
> s_kobj.state_in_sysfs and using a
> dedicated mutex to serialize against kobject_del() in
> ext4_unregister_sysfs(). It is indeed much simpler than my original
> approach of reordering the teardown sequence.
Yes, the patch looks good to me. Please create a proper patch submission
with changelog etc. and I can give you my reviewed-by tag :).
Honza
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index b76966dc06c0..c012e85d2515 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1574,6 +1574,7 @@ struct ext4_sb_info {
> struct proc_dir_entry *s_proc;
> struct kobject s_kobj;
> struct completion s_kobj_unregister;
> + struct mutex s_error_notify_mutex; /* protects sysfs_notify vs
> kobject_del */
> struct super_block *s_sb;
> struct buffer_head *s_mmp_bh;
>
> diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
> index d2ecc1026c0c..f2505988e010 100644
> --- a/fs/ext4/sysfs.c
> +++ b/fs/ext4/sysfs.c
> @@ -597,7 +597,10 @@ static const struct kobj_type ext4_feat_ktype = {
>
> void ext4_notify_error_sysfs(struct ext4_sb_info *sbi)
> {
> - sysfs_notify(&sbi->s_kobj, NULL, "errors_count");
> + mutex_lock(&sbi->s_error_notify_mutex);
> + if (sbi->s_kobj.state_in_sysfs)
> + sysfs_notify(&sbi->s_kobj, NULL, "errors_count");
> + mutex_unlock(&sbi->s_error_notify_mutex);
> }
>
> static struct kobject *ext4_root;
> @@ -610,6 +613,7 @@ int ext4_register_sysfs(struct super_block *sb)
> int err;
>
> init_completion(&sbi->s_kobj_unregister);
> + mutex_init(&sbi->s_error_notify_mutex);
> err = kobject_init_and_add(&sbi->s_kobj, &ext4_sb_ktype, ext4_root,
> "%s", sb->s_id);
> if (err) {
> @@ -644,7 +648,10 @@ void ext4_unregister_sysfs(struct super_block *sb)
>
> if (sbi->s_proc)
> remove_proc_subtree(sb->s_id, ext4_proc_root);
> +
> + mutex_lock(&sbi->s_error_notify_mutex);
> kobject_del(&sbi->s_kobj);
> + mutex_unlock(&sbi->s_error_notify_mutex);
> }
>
> int __init ext4_init_sysfs(void)
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR