[PATCH v2 1/4] timens: make vdso_join_timens() always succeed
From: Christian Brauner
Date: Mon Jul 06 2020 - 11:49:26 EST
As discussed on-list (cf. [1]), in order to make setns() support time
namespaces when attaching to multiple namespaces at once properly we
need to tweak vdso_join_timens() to always succeed. So switch
vdso_join_timens() to using a read lock and replacing
mmap_write_lock_killable() to mmap_read_lock() as we discussed.
Last cycle setns() was changed to support attaching to multiple namespaces
atomically. This requires all namespaces to have a point of no return where
they can't fail anymore. Specifically, <namespace-type>_install() is
allowed to perform permission checks and install the namespace into the new
struct nsset that it has been given but it is not allowed to make visible
changes to the affected task. Once <namespace-type>_install() returns
anything that the given namespace type requires to be setup in addition
needs to ideally be done in a function that can't fail or if it fails the
failure is not fatal. For time namespaces the relevant functions that fall
into this category are timens_set_vvar_page() and vdso_join_timens().
Currently the latter can fail but doesn't need to. With this we can go on
to implement a timens_commit() helper in a follow up patch to be used by
setns().
[1]: https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein
Signed-off-by: Christian Brauner <christian.brauner@xxxxxxxxxx>
Reviewed-by: Andrei Vagin <avagin@xxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Dmitry Safonov <dima@xxxxxxxxxx>
Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
Link: https://lore.kernel.org/r/20200619153559.724863-2-christian.brauner@xxxxxxxxxx
Signed-off-by: Christian Brauner <christian.brauner@xxxxxxxxxx>
---
arch/x86/entry/vdso/vma.c | 5 ++---
kernel/time/namespace.c | 10 ++--------
2 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index ea7c1f0b79df..9185cb1d13b9 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -144,8 +144,7 @@ int vdso_join_timens(struct task_struct *task, struct time_namespace *ns)
struct mm_struct *mm = task->mm;
struct vm_area_struct *vma;
- if (mmap_write_lock_killable(mm))
- return -EINTR;
+ mmap_read_lock(mm);
for (vma = mm->mmap; vma; vma = vma->vm_next) {
unsigned long size = vma->vm_end - vma->vm_start;
@@ -154,7 +153,7 @@ int vdso_join_timens(struct task_struct *task, struct time_namespace *ns)
zap_page_range(vma, vma->vm_start, size);
}
- mmap_write_unlock(mm);
+ mmap_read_unlock(mm);
return 0;
}
#else
diff --git a/kernel/time/namespace.c b/kernel/time/namespace.c
index 5d9fc22d836a..e5af6fe87af8 100644
--- a/kernel/time/namespace.c
+++ b/kernel/time/namespace.c
@@ -284,7 +284,6 @@ static int timens_install(struct nsset *nsset, struct ns_common *new)
{
struct nsproxy *nsproxy = nsset->nsproxy;
struct time_namespace *ns = to_time_ns(new);
- int err;
if (!current_is_single_threaded())
return -EUSERS;
@@ -295,9 +294,7 @@ static int timens_install(struct nsset *nsset, struct ns_common *new)
timens_set_vvar_page(current, ns);
- err = vdso_join_timens(current, ns);
- if (err)
- return err;
+ vdso_join_timens(current, ns);
get_time_ns(ns);
put_time_ns(nsproxy->time_ns);
@@ -313,7 +310,6 @@ int timens_on_fork(struct nsproxy *nsproxy, struct task_struct *tsk)
{
struct ns_common *nsc = &nsproxy->time_ns_for_children->ns;
struct time_namespace *ns = to_time_ns(nsc);
- int err;
/* create_new_namespaces() already incremented the ref counter */
if (nsproxy->time_ns == nsproxy->time_ns_for_children)
@@ -321,9 +317,7 @@ int timens_on_fork(struct nsproxy *nsproxy, struct task_struct *tsk)
timens_set_vvar_page(tsk, ns);
- err = vdso_join_timens(tsk, ns);
- if (err)
- return err;
+ vdso_join_timens(tsk, ns);
get_time_ns(ns);
put_time_ns(nsproxy->time_ns);
--
2.27.0