Re: clone3: allow creation of time namespace with offset

From: Christian Brauner
Date: Tue Mar 17 2020 - 12:09:53 EST


On Wed, Mar 18, 2020 at 01:23:50AM +1100, Aleksa Sarai wrote:
> On 2020-03-17, Michael Kerrisk (man-pages) <mtk.manpages@xxxxxxxxx> wrote:
> > [CC += linux-api; please CC on future versions]
> >
> > On Tue, 17 Mar 2020 at 09:32, Adrian Reber <areber@xxxxxxxxxx> wrote:
> > > Requiring nanoseconds as well as seconds for two clocks during clone3()
> > > means that it would require 4 additional members to 'struct clone_args':
> > >
> > > __aligned_u64 tls;
> > > __aligned_u64 set_tid;
> > > __aligned_u64 set_tid_size;
> > > + __aligned_u64 boottime_offset_seconds;
> > > + __aligned_u64 boottime_offset_nanoseconds;
> > > + __aligned_u64 monotonic_offset_seconds;
> > > + __aligned_u64 monotonic_offset_nanoseconds;
> > > };
> > >
> > > To avoid four additional members to 'struct clone_args' this patchset
> > > uses another approach:
> > >
> > > __aligned_u64 tls;
> > > __aligned_u64 set_tid;
> > > __aligned_u64 set_tid_size;
> > > + __aligned_u64 timens_offset;
> > > + __aligned_u64 timens_offset_size;
> > > };
> > >
> > > timens_offset is a pointer to an array, just as previously done with
> > > set_tid, and timens_offset_size is the size of the array.
> > >
> > > The timens_offset array is expected to contain a struct like this:
> > >
> > > struct set_timens_offset {
> > > int clockid;
> > > struct timespec val;
> > > };
> > >
> > > This way it is possible to pass the information of multiple clocks with
> > > seconds and nanoseconds to clone3().
> > >
> > > To me this seems the better approach, but I am not totally convinced
> > > that it is the right thing. If there are other ideas for how to pass two
> > > clock offsets with seconds and nanoseconds to clone3(), I would be happy
> > > to hear them.
>
> While I agree this does make the API cleaner, I am a little worried that
> it risks killing some of the ideas we discussed for seccomp deep
> inspection. In particular, having a pointer to variable-sized data
> inside the struct means that now the cBPF program can't just be given a
> copy of the struct data from userspace to check.

I suggested two alternative approaches in a response to this. The
easiest one would be to simply assume that the struct doesn't change
size.
(But haven't we crossed that bridge with the set_tid array already?)
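
For anyone following along, here is a rough userspace sketch of the layout
described in the quoted patch and of the concern about handing a filter only
a flat copy of the struct. It is an illustration under stated assumptions,
not the proposed kernel code: struct clone_args_sketch,
clone_args_set_timens() and sec_seen_through_copy() are names made up for
this example, the struct is a stand-in for the tail of 'struct clone_args'
rather than the real UAPI definition, and the clock id fallbacks are only
there so the sketch builds where the headers lack them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Fallbacks (Linux values) in case the libc headers do not expose these. */
#ifndef CLOCK_MONOTONIC
#define CLOCK_MONOTONIC 1
#endif
#ifndef CLOCK_BOOTTIME
#define CLOCK_BOOTTIME 7
#endif

/* Per-clock offset entry, as given in the quoted patch description. */
struct set_timens_offset {
	int clockid;
	struct timespec val;
};

/*
 * Hypothetical stand-in for the tail of 'struct clone_args' with the two
 * proposed members. Earlier members are omitted; this is not the UAPI struct.
 */
struct clone_args_sketch {
	uint64_t set_tid;
	uint64_t set_tid_size;
	uint64_t timens_offset;      /* userspace pointer to the array */
	uint64_t timens_offset_size; /* number of array elements */
};

/* Make 'args' refer to a caller-owned array of 'n' offsets. */
static void clone_args_set_timens(struct clone_args_sketch *args,
				  const struct set_timens_offset *offs,
				  size_t n)
{
	args->timens_offset = (uint64_t)(uintptr_t)offs;
	args->timens_offset_size = n;
}

/*
 * Model a filter that only receives a flat copy of the struct. The copy
 * captures the pointer value, not the array behind it, so a later write to
 * the array is still visible through the already "checked" copy.
 */
static long sec_seen_through_copy(void)
{
	struct set_timens_offset offs[2] = {
		{ .clockid = CLOCK_MONOTONIC, .val = { .tv_sec = 10, .tv_nsec = 0 } },
		{ .clockid = CLOCK_BOOTTIME,  .val = { .tv_sec = 20, .tv_nsec = 0 } },
	};
	struct clone_args_sketch args = {0}, snapshot;
	const struct set_timens_offset *seen;

	clone_args_set_timens(&args, offs, 2);

	/* What a filter given only the struct bytes would hold. */
	memcpy(&snapshot, &args, sizeof(snapshot));

	/* Userspace modifies the array after the snapshot was taken. */
	offs[0].val.tv_sec = 99;

	assert(snapshot.timens_offset_size == 2);
	seen = (const struct set_timens_offset *)(uintptr_t)snapshot.timens_offset;
	return (long)seen[0].val.tv_sec;
}
```

With these assumptions, sec_seen_through_copy() returns 99 rather than 10:
the snapshot holds only the address and the element count, so anything that
wants to inspect the offsets has to copy the array as well. The set_tid array
already has the same shape, which is the point of the question above.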