Re: [PATCH] exit: move exit_task_namespaces() after exit_task_work()
From: Cong Wang
Date: Fri Dec 15 2017 - 19:00:25 EST
On Thu, Dec 14, 2017 at 1:08 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Dec 14, 2017 at 12:17:57PM -0800, Cong Wang wrote:
>> syzbot reported we have a use-after-free when mqueue_evict_inode()
>> is called on __cleanup_mnt() path, where the ipc ns is already
>> freed by the previous exit_task_namespaces(). We can just move
>> it after exit_task_work() to avoid this use-after-free.
>
> What's to prevent somebody else holding a reference to the same
> inode past the exit(2)? IOW, I don't believe that this is fixing
> anything - in the best case, your patch papers over a specific
> reproducer.
You are right, I missed mq_clear_sbinfo().
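
To make the lifetime problem concrete, here is a toy user-space model, not kernel code. Every toy_* name is hypothetical, and it assumes (as I read ipc/mqueue.c) that the superblock keeps an uncounted back-pointer to the ipc ns, which mq_clear_sbinfo() clears before the ns is freed. An inode held by someone else can be evicted at any time after the task exits, so only severing the back-pointer helps, not reordering the exit path:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; these are NOT the kernel's real types. */
struct toy_ns {
    int refcount;
    int queues_count;
};

struct toy_sb {
    struct toy_ns *fs_info;   /* back-pointer, not a counted reference */
};

struct toy_inode {
    struct toy_sb *sb;
};

/* Drop one reference; on the last one, sever the back-pointer and free. */
static void toy_put_ns(struct toy_ns *ns, struct toy_sb *sb, int clear_sbinfo)
{
    if (--ns->refcount > 0)
        return;
    if (clear_sbinfo)
        sb->fs_info = NULL;   /* the step the toy uses to model mq_clear_sbinfo() */
    free(ns);
}

/* Returns 1 if eviction touched the namespace, 0 if it was skipped. */
static int toy_evict_inode(struct toy_inode *inode)
{
    struct toy_ns *ns = inode->sb->fs_info;

    if (ns == NULL)
        return 0;             /* namespace already gone, nothing to account */
    ns->queues_count--;
    return 1;
}

/* The task exits first; another holder drops the inode afterwards. */
static int toy_demo(void)
{
    struct toy_ns *ns = calloc(1, sizeof(*ns));
    struct toy_sb sb = { .fs_info = ns };
    struct toy_inode inode = { .sb = &sb };

    assert(ns != NULL);
    ns->refcount = 1;
    ns->queues_count = 1;

    toy_put_ns(ns, &sb, 1);          /* last reference: ns is freed here */
    return toy_evict_inode(&inode);  /* late eviction sees NULL and skips */
}
```

Calling toy_put_ns() with clear_sbinfo == 0 instead would leave sb.fs_info dangling, and the later toy_evict_inode() would dereference freed memory no matter where in the exit path the put happens.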
And the offending commit is:

commit 9c583773d036336176e9e50441890659bc4eeae8
Author: Giuseppe Scrivano <gscrivan@xxxxxxxxxx>
Date: Fri Dec 15 01:06:28 2017 +0000

ipc, mqueue: lazy call kern_mount_data in new namespaces

kern_mount_data is a relatively expensive operation when creating a new
IPC namespace, so delay the mount until its first usage when not creating
the global namespace.

This is a net saving for new IPC namespaces that don't use mq_open(). In
this case there won't be any kern_mount_data() cost at all.

On my machine, the time for creating 1000 new IPC namespaces dropped from
~8s to ~2s.

Link: http://lkml.kernel.org/r/20171206151422.9660-1-gscrivan@xxxxxxxxxx
Signed-off-by: Giuseppe Scrivano <gscrivan@xxxxxxxxxx>
Cc: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
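
For readers who have not looked at that commit, the idea it describes can be sketched in user space roughly as below. This is only an illustration of "mount on first use": all toy_* names are hypothetical, the counter stands in for the cost of kern_mount_data(), and the locking a real implementation would need around the first-use check is left out.

```c
#include <stddef.h>

#define TOY_NR_NS 1000

/* Toy stand-ins; these are NOT the kernel's real types. */
struct toy_mount {
    int id;
};

struct toy_ipc_ns {
    struct toy_mount *mq_mnt;   /* NULL until the first mqueue use */
    struct toy_mount storage;   /* backing storage for the toy mount */
};

static int toy_mount_calls;     /* how many times the expensive step ran */

/* Stand-in for the expensive mount operation. */
static struct toy_mount *toy_expensive_mount(struct toy_ipc_ns *ns)
{
    toy_mount_calls++;
    ns->storage.id = toy_mount_calls;
    return &ns->storage;
}

/* Creating a namespace no longer mounts anything. */
static void toy_ns_init(struct toy_ipc_ns *ns)
{
    ns->mq_mnt = NULL;
}

/* First user (think mq_open()) pays for the mount; later users reuse it. */
static struct toy_mount *toy_mq_get_mount(struct toy_ipc_ns *ns)
{
    if (ns->mq_mnt == NULL)
        ns->mq_mnt = toy_expensive_mount(ns);
    return ns->mq_mnt;
}

/* Create many namespaces, use mqueue in two of them, count the mounts. */
static int toy_lazy_demo(void)
{
    static struct toy_ipc_ns spaces[TOY_NR_NS];
    size_t i;

    toy_mount_calls = 0;
    for (i = 0; i < TOY_NR_NS; i++)
        toy_ns_init(&spaces[i]);

    toy_mq_get_mount(&spaces[3]);
    toy_mq_get_mount(&spaces[3]);   /* second use: no new mount */
    toy_mq_get_mount(&spaces[7]);

    return toy_mount_calls;
}
```

The flip side, and the reason this commit matters here, is that mq_mnt can now legitimately be NULL at teardown, so any cleanup path that assumed the mount always exists has to cope with that.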