Re: [PATCH] fsnotify: fix inode reference leak in fsnotify_recalc_mask()
From: 尹欣
Date: Mon Apr 20 2026 - 03:24:09 EST
Cc Jan
Sorry, I used the wrong email address.
Thanks,
Xin Yin
> From: "Xin Yin"<yinxin.x@xxxxxxxxxxxxx>
> Date: Mon, Apr 20, 2026, 14:51
> Subject: [PATCH] fsnotify: fix inode reference leak in fsnotify_recalc_mask()
> To: <jan@xxxxxxx>, <amir73il@xxxxxxxxx>
> Cc: <linux-fsdevel@xxxxxxxxxxxxxxx>, <linux-kernel@xxxxxxxxxxxxxxx>, "Xin Yin"<yinxin.x@xxxxxxxxxxxxx>
> fsnotify_recalc_mask() fails to handle the return value of
> __fsnotify_recalc_mask(), which may return an inode pointer that needs
> to be released via fsnotify_drop_object() when the connector's HAS_IREF
> flag transitions from set to cleared.
>
> This manifests as a hung task with the following call trace:
>
> INFO: task umount:1234 blocked for more than 120 seconds.
> Call Trace:
> __schedule
> schedule
> fsnotify_sb_delete
> generic_shutdown_super
> kill_anon_super
> cleanup_mnt
> task_work_run
> do_exit
> do_group_exit
>
> The race window that triggers the iref leak:
>
> Thread A (adding mark) Thread B (removing mark)
> ────────────────────── ────────────────────────
> fsnotify_add_mark_locked():
> fsnotify_add_mark_list():
> spin_lock(conn->lock)
> add mark_B(evictable) to list
> spin_unlock(conn->lock)
> return
>
> /* ---- gap: no lock held ---- */
>
> fsnotify_detach_mark(mark_A):
> spin_lock(mark_A->lock)
> clear ATTACHED flag on mark_A
> spin_unlock(mark_A->lock)
>
> fsnotify_recalc_mask():
> spin_lock(conn->lock)
> __fsnotify_recalc_mask():
> /* mark_A skipped: ATTACHED cleared */
> /* only mark_B(evictable) remains */
> want_iref = false
> has_iref = true /* not yet cleared */
> -> HAS_IREF transitions true -> false
> -> returns inode pointer
> spin_unlock(conn->lock)
> /* BUG: return value discarded!
> * iput() and fsnotify_put_sb_watched_objects()
> * are never called */
>
> Fix this by capturing the return value of __fsnotify_recalc_mask() and
> passing it to fsnotify_drop_object() after releasing the spinlock, which
> is the same pattern used in fsnotify_put_mark().
>
> Fixes: c3638b5b1374 ("fsnotify: allow adding an inode mark without pinning inode")
> Signed-off-by: Xin Yin <yinxin.x@xxxxxxxxxxxxx>
> ---
> fs/notify/mark.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index c2ed5b11b0fe6..cc93fcc2c5a9c 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -283,6 +283,8 @@ static void fsnotify_conn_set_children_dentry_flags(
> fsnotify_set_children_dentry_flags(fsnotify_conn_inode(conn));
> }
>
> +static void fsnotify_drop_object(unsigned int type, void *objp);
> +
> /*
> * Calculate mask of events for a list of marks. The caller must make sure
> * connector and connector->obj cannot disappear under us. Callers achieve
> @@ -292,15 +294,19 @@ static void fsnotify_conn_set_children_dentry_flags(
> void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
> {
> bool update_children;
> + unsigned int type;
> + void *objp;
>
> if (!conn)
> return;
>
> spin_lock(&conn->lock);
> update_children = !fsnotify_conn_watches_children(conn);
> - __fsnotify_recalc_mask(conn);
> + objp = __fsnotify_recalc_mask(conn);
> + type = conn->type;
> update_children &= fsnotify_conn_watches_children(conn);
> spin_unlock(&conn->lock);
> + fsnotify_drop_object(type, objp);
> /*
> * Set children's PARENT_WATCHED flags only if parent started watching.
> * When parent stops watching, we clear false positive PARENT_WATCHED
> --
> 2.20.1
>