Re: [PATCH] fs: handle shrinker registration failure in sget_userns

From: Jan Kara
Date: Thu Nov 23 2017 - 07:56:32 EST


On Thu 23-11-17 13:53:45, Michal Hocko wrote:
> On Thu 23-11-17 13:45:41, Michal Hocko wrote:
> [...]
> > What about the following?
>
> Dohh, a rebase artifact sneaked in. So let me try again. Sorry about
> spamming :/
> ---
> From 467be16ca5165613daf292a68592e3b5bc7252c5 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@xxxxxxxx>
> Date: Thu, 23 Nov 2017 12:28:35 +0100
> Subject: [PATCH] fs: handle shrinker registration failure in sget_userns
>
> Syzbot has reported a NULL ptr dereference during mntput because of
> the sb shrinker being NULL:
> CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> task: ffff8801d1dbe5c0 task.stack: ffff8801c9e38000
> RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51
> RSP: 0018:ffff8801c9e3f108 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff8801c53c6f98 RDI: ffff8801c53c6fa0
> RBP: ffff8801c9e3f120 R08: 1ffff100393c7d55 R09: 0000000000000004
> R10: ffff8801c9e3ef70 R11: 0000000000000000 R12: 0000000000000000
> R13: dffffc0000000000 R14: 1ffff100393c7e45 R15: ffff8801c53c6f98
> FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
> CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 00000000dbc23000 CR3: 00000001c7269000 CR4: 00000000001406e0
> DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Call Trace:
> __list_del_entry include/linux/list.h:117 [inline]
> list_del include/linux/list.h:125 [inline]
> unregister_shrinker+0x79/0x300 mm/vmscan.c:301
> deactivate_locked_super+0x64/0xd0 fs/super.c:308
> deactivate_super+0x141/0x1b0 fs/super.c:340
> cleanup_mnt+0xb2/0x150 fs/namespace.c:1173
> mntput_no_expire+0x6e0/0xa90 fs/namespace.c:1237
> mntput fs/namespace.c:1247 [inline]
> kern_unmount+0x9c/0xd0 fs/namespace.c:2999
> mq_put_mnt+0x37/0x50 ipc/mqueue.c:1609
> put_ipc_ns+0x4d/0x150 ipc/namespace.c:163
> free_nsproxy+0xc0/0x1f0 kernel/nsproxy.c:180
> switch_task_namespaces+0x9d/0xc0 kernel/nsproxy.c:229
> exit_task_namespaces+0x17/0x20 kernel/nsproxy.c:234
> do_exit+0x9b0/0x1ad0 kernel/exit.c:864
> do_group_exit+0x149/0x400 kernel/exit.c:968
>
> Tetsuo has correctly pointed out that the real reason is that fault
> injection has caused register_shrinker to fail and the error path is not
> handled in sget_userns.
>
> Fix the issue by moving the shrinker registration up to the point where
> the superblock is allocated, and fail early, even before we try to
> register the superblock. This should be safe wrt. parallel shrinker
> invocation as we are holding the s_umount lock, which blocks shrinker
> invocation.
>
> The issue is very unlikely to trigger in production because small
> allocations do not usually fail.
>
> Debugged-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>

Looks good to me now. You can add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Honza

> ---
> fs/super.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index d4e33e8f1e6f..80b118cc4bb6 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -155,11 +155,19 @@ static void destroy_super_rcu(struct rcu_head *head)
>  	schedule_work(&s->destroy_work);
>  }
>
> -/* Free a superblock that has never been seen by anyone */
> +/*
> + * Free a superblock that has never been seen by anyone. Note that shrinkers
> + * could have been invoked already but we rely on s_umount to not actually
> + * touch it.
> + */
>  static void destroy_unused_super(struct super_block *s)
>  {
>  	if (!s)
>  		return;
> +
> +	if (!list_empty(&s->s_shrink.list))
> +		unregister_shrinker(&s->s_shrink);
> +
>  	up_write(&s->s_umount);
>  	list_lru_destroy(&s->s_dentry_lru);
>  	list_lru_destroy(&s->s_inode_lru);
> @@ -252,6 +260,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>  	s->s_shrink.count_objects = super_cache_count;
>  	s->s_shrink.batch = 1024;
>  	s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
> +	INIT_LIST_HEAD(&s->s_shrink.list);
>  	return s;
>
>  fail:
> @@ -503,6 +512,10 @@ struct super_block *sget_userns(struct file_system_type *type,
>  		s = alloc_super(type, (flags & ~SB_SUBMOUNT), user_ns);
>  		if (!s)
>  			return ERR_PTR(-ENOMEM);
> +		if (register_shrinker(&s->s_shrink)) {
> +			destroy_unused_super(s);
> +			return ERR_PTR(-ENOMEM);
> +		}
>  		goto retry;
>  	}
>
> @@ -518,7 +531,6 @@ struct super_block *sget_userns(struct file_system_type *type,
>  	hlist_add_head(&s->s_instances, &type->fs_supers);
>  	spin_unlock(&sb_lock);
>  	get_filesystem(type);
> -	register_shrinker(&s->s_shrink);
>  	return s;
>  }
>
> --
> 2.15.0
>
> --
> Michal Hocko
> SUSE Labs
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
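
For readers of the archive, here is a minimal user-space sketch of the
error-path pattern the quoted patch relies on: initialise the list node at
allocation time, register early, and let the destructor unregister only when
the node is actually linked. This is not the kernel code. struct list_node,
struct object, object_register() and the other names are hypothetical
stand-ins, and the inject_failure flag is an assumption that only mimics
fault injection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-in for a circular doubly linked list node. */
struct list_node {
	struct list_node *next, *prev;
};

/* Global registry head, playing the role of a global list of registrations. */
static struct list_node registry = { &registry, &registry };

/* Point the node at itself so "unlinked" is a well-defined state. */
static void node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static bool node_unlinked(const struct list_node *n)
{
	return n->next == n;
}

static void node_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Hypothetical object that embeds a registration node. */
struct object {
	struct list_node link;
};

/* Hypothetical registration; inject_failure mimics fault injection. */
static int object_register(struct object *o, bool inject_failure)
{
	if (inject_failure)
		return -1;
	node_add(&o->link, &registry);
	return 0;
}

/* Safe for NULL, for never-registered and for registered objects. */
static void object_destroy(struct object *o)
{
	if (!o)
		return;
	if (!node_unlinked(&o->link))
		node_del_init(&o->link);
	free(o);
}

/* Allocate, initialise the node, register early, and fail early. */
static struct object *object_create(bool inject_failure)
{
	struct object *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	node_init(&o->link);
	if (object_register(o, inject_failure)) {
		object_destroy(o);
		return NULL;
	}
	return o;
}

/* Count registered objects by walking the registry. */
static size_t registry_count(void)
{
	size_t n = 0;

	for (struct list_node *p = registry.next; p != &registry; p = p->next)
		n++;
	return n;
}
```

Because the node is initialised before registration is attempted, the
destructor can tell an object whose registration failed from one that is
linked, so the failure path never deletes an entry that was never added.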