Re: [PATCH 2/4] clone3: allow creation of time namespace with offset

From: Christian Brauner
Date: Wed Mar 18 2020 - 08:13:33 EST


On Tue, Mar 17, 2020 at 09:30:42AM +0100, Adrian Reber wrote:
> This extends clone3() to support the time namespace via CLONE_NEWTIME.
> In addition to creating a new process in a new time namespace this
> allows setting the clock offset in the newly created time namspace.
>
> The time namespace allows to set an offset for two clocks.
> CLOCK_MONOTONIC and CLOCK_BOOTTIME.
>
> This clone3() extension also offers setting both offsets through the
> newly introduced clone_args members timens_offset and
> timens_offset_size.
>
> timens_offset: Pointer to an array of clock offsets for the
> newly created process in a time namespaces.
> This requires that a new time namespace has been
> requested via CLONE_NEWTIME. It is only possible
> to set an offset for CLOCK_MONOTONIC and
> CLOCK_BOOTTIME. The array can therefore never
> have more than two elements.
> clone3() expects the array to contain the
> following struct:
> struct set_timens_offset {
> int clockid;
> struct timespec val;
> };
>
> timens_offset_size: This defines the size of the array referenced
> in timens_offset. Currently this is limited
> to two elements.
>
> To create a new process using clone3() in a new time namespace with
> clock offsets, something like this can be used:
>
> struct set_timens_offset timens_offset[2];
>
> timens_offset[0].clockid = CLOCK_BOOTTIME;
> timens_offset[0].val.tv_sec = -1000;
> timens_offset[0].val.tv_nsec = 42;
> timens_offset[1].clockid = CLOCK_MONOTONIC;
> timens_offset[1].val.tv_sec = 1000000;
> timens_offset[1].val.tv_nsec = 37;
>
> struct _clone_args args = {
> .flags = CLONE_NEWTIME,
> .timens_offset = ptr_to_u64(timens_offset),
> .timens_offset_size = 2;
> };

In all honesty, this would be a terrible API and I think we need to come
up with something better than this. I don't want to pass down an array
of structs and in general would like to avoid this array + size pattern.
That pattern kinda made sense for the pid array because of pid
namespaces being nested but not for this case, I think. Also, why
require the additional clockid argument here? That makes sense for
clock_settime() and clock_gettime() but here we could just do:

struct timens {
struct timespec clock_bootime;
struct timespec clock_monotonic;
};

no? And since you need to expose that struct in a header somewhere
anyway you can version it by size just like clone_args. So the kernel
can apply the same pattern to be backwards compatible that we have with
struct clone_args and for openat2()'s struct open_how via
copy_struct_from_user. Then you only need one additional pointer in
struct clone_args.

What do we think?

Christian