Re: Time Namespaces: CLONE_NEWTIME and clone3()?

From: Eric W. Biederman
Date: Mon Feb 17 2020 - 21:40:06 EST


Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:

> Christian Brauner <christian.brauner@xxxxxxxxxx> writes:
>> On Mon, Feb 17, 2020 at 10:47:53PM +0100, Michael Kerrisk (man-pages) wrote:
>>> Actually, I think the alternative you propose just here is better. I
>>> imagine there are times when one will want to create multiple
>>> namespaces with a single call to clone3(), including a time namespace.
>>> I think this should be allowed by the API. And, otherwise, clone3()
>>> becomes something of a second-class citizen for creating namespaces.
>>> (I don't really get the "less invasive" argument. Implementing this is
>>> just a piece of kernel to code to make user-space's life a bit simpler
>>> and more consistent.)
>>
>> I don't particularly mind either way. If there's actual users that need
>> to set it at clone3() time then we can extend it. So I'd like to hear
>> what Adrian, Dmitry, and Thomas think since they are well-versed how
>> this will be used in the wild. I'm weary of exposing a whole new uapi
>> struct and extending clone3() without any real use-case but I'm happy to
>> if there is!
>
> I really have no clue. I merily helped getting this in shape without
> creating havoc for timekeeping and VDSO. I have to punt to the container
> wizards.

Short version. If you are going to do migration of a container with
CRIU you want the time namespace in your container. Possibly you can
avoid creating the time namespace until restore, but I don't think so.

Without the time namespace you get all kinds of applications that use
monotonic timers that will see their timers be ill behaved (probably
going backwards) over a checkpoint-restart event.

Eric