Re: [PATCH 2/6] arm64/vdso: Zap vvar pages when switching to a time namespace

From: Mark Rutland
Date: Tue Jun 16 2020 - 07:24:26 EST


On Tue, Jun 16, 2020 at 12:55:41AM -0700, Andrei Vagin wrote:
> The VVAR page layout depends on whether a task belongs to the root or
> non-root time namespace.

Please be more explicit as to what you mean by `layout` here, as that seems to
be an overloaded term. For example, the comment above timens_setup_vdso_data()
can be read to directly contradict this:

| A time namespace VVAR page has the same layout as the VVAR page which
| contains the system wide VDSO data.

... I think you're trying to say is that when we add time namespace support,
we'll have multiple VVAR pages, and their position in the address space depends
on whether the task is part of a time namespace.

> Whenever a task changes its namespace, the VVAR
> page tables are cleared and then they will be re-faulted with a
> corresponding layout.

How does this work for multi-threaded applications? Are there any
concerns w.r.t. atomicity of the change?

> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
> Reviewed-by: Dmitry Safonov <dima@xxxxxxxxxx>
> Signed-off-by: Andrei Vagin <avagin@xxxxxxxxx>
> ---
> arch/arm64/kernel/vdso.c | 32 ++++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
> index b0aec4e8c9b4..df4bb736d28a 100644
> --- a/arch/arm64/kernel/vdso.c
> +++ b/arch/arm64/kernel/vdso.c
> @@ -125,6 +125,38 @@ static int __vdso_init(enum vdso_abi abi)
> return 0;
> }
>
> +#ifdef CONFIG_TIME_NS
> +/*
> + * The vvar page layout depends on whether a task belongs to the root or
> + * non-root time namespace. Whenever a task changes its namespace, the VVAR
> + * page tables are cleared and then they will re-faulted with a
> + * corresponding layout.
> + * See also the comment near timens_setup_vdso_data() for details.
> + */

As with the commit message, this is not very clear and can be read to
contradict the comment it refers to, which is rather unhelpful.

How about:

/*
* The vvar mapping contains data for a specific time namespace, so when
* a task changes namespace we must unmap its vvar data for the old
* namespace. Subsequent faults will map in data for the new namespace.
*
* For more details see timens_setup_vdso_data().
*/

Thanks,
Mark.

> +int vdso_join_timens(struct task_struct *task, struct time_namespace *ns)
> +{
> + struct mm_struct *mm = task->mm;
> + struct vm_area_struct *vma;
> +
> + if (mmap_write_lock_killable(mm))
> + return -EINTR;
> +
> + for (vma = mm->mmap; vma; vma = vma->vm_next) {
> + unsigned long size = vma->vm_end - vma->vm_start;
> +
> + if (vma_is_special_mapping(vma, vdso_info[VDSO_ABI_AA64].dm))
> + zap_page_range(vma, vma->vm_start, size);
> +#ifdef CONFIG_COMPAT_VDSO
> + if (vma_is_special_mapping(vma, vdso_info[VDSO_ABI_AA32].dm))
> + zap_page_range(vma, vma->vm_start, size);
> +#endif
> + }
> +
> + mmap_write_unlock(mm);
> + return 0;
> +}
> +#endif
> +
> static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
> struct vm_area_struct *vma, struct vm_fault *vmf)
> {
> --
> 2.24.1
>