Re: [PATCH 2/6] arm64/vdso: Zap vvar pages when switching to a time namespace

From: Andrei Vagin
Date: Tue Jun 23 2020 - 03:33:11 EST


On Fri, Jun 19, 2020 at 05:38:12PM +0200, Christian Brauner wrote:
> On Tue, Jun 16, 2020 at 12:55:41AM -0700, Andrei Vagin wrote:
> > The VVAR page layout depends on whether a task belongs to the root or
> > non-root time namespace. Whenever a task changes its namespace, the VVAR
> > page tables are cleared and then they will be re-faulted with a
> > corresponding layout.
> >
> > Reviewed-by: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
> > Reviewed-by: Dmitry Safonov <dima@xxxxxxxxxx>
> > Signed-off-by: Andrei Vagin <avagin@xxxxxxxxx>
> > ---
> > arch/arm64/kernel/vdso.c | 32 ++++++++++++++++++++++++++++++++
> > 1 file changed, 32 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
> > index b0aec4e8c9b4..df4bb736d28a 100644
> > --- a/arch/arm64/kernel/vdso.c
> > +++ b/arch/arm64/kernel/vdso.c
> > @@ -125,6 +125,38 @@ static int __vdso_init(enum vdso_abi abi)
> >  	return 0;
> > }
> >
> > +#ifdef CONFIG_TIME_NS
> > +/*
> > + * The vvar page layout depends on whether a task belongs to the root or
> > + * non-root time namespace. Whenever a task changes its namespace, the VVAR
> > + * page tables are cleared and then they will be re-faulted with a
> > + * corresponding layout.
> > + * See also the comment near timens_setup_vdso_data() for details.
> > + */
> > +int vdso_join_timens(struct task_struct *task, struct time_namespace *ns)
> > +{
> > +	struct mm_struct *mm = task->mm;
> > +	struct vm_area_struct *vma;
> > +
> > +	if (mmap_write_lock_killable(mm))
> > +		return -EINTR;
>
> Hey,
>
> Just a heads-up I'm about to plumb CLONE_NEWTIME support into setns()

Hmm. I am not sure that I understand what you mean. I think setns(nsfd,
CLONE_NEWTIME) works now. For example, we use it in
tools/testing/selftests/timens/timens.c. Do you mean setns(pidfd,
CLONE_NEWTIME | CLONE_something)?
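
To be concrete about the call I mean, here is a minimal userspace sketch of
setns(nsfd, CLONE_NEWTIME). It is not copied from the selftest: the helper
name join_time_namespace() is made up for illustration, and the fallback
define is only there in case the libc headers are too old to carry the flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/* Older libc headers may lack this flag; the value matches the kernel uapi. */
#ifndef CLONE_NEWTIME
#define CLONE_NEWTIME 0x00000080
#endif

/*
 * Hypothetical helper: join the time namespace referred to by ns_path,
 * e.g. "/proc/<pid>/ns/time". Returns 0 on success or a negative errno.
 */
static int join_time_namespace(const char *ns_path)
{
	int fd, ret = 0;

	fd = open(ns_path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -errno;

	/* A non-zero nstype makes the kernel check that fd is a time namespace. */
	if (setns(fd, CLONE_NEWTIME) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

A caller would pass something like /proc/<pid>/ns/time and treat any
negative return value as a failure to switch.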

> which would mean that vdso_join_timens() would not be allowed to fail
> anymore to make it easy to switch to multiple namespaces atomically. So
> this would probably need to be changed to mmap_write_lock() which I've
> already brought up upstream:
> https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein/
> (Assuming that people agree. I just sent the series and most people here
> are Cced.)
>
> Thanks!
> Christian