Re: [RFC v2][PATCH 2/9] General infrastructure for checkpoint restart

From: Louis Rilling
Date: Thu Aug 21 2008 - 05:35:43 EST


On Wed, Aug 20, 2008 at 11:04:13PM -0400, Oren Laadan wrote:
>
> Add those interfaces, as well as helpers needed to easily manage the
> file format. The code is roughly broken out as follows:
>
> ckpt/sys.c - user/kernel data transfer, as well as setup of the
> checkpoint/restart context (a per-checkpoint data structure for
> housekeeping)
>
> ckpt/checkpoint.c - output wrappers and basic checkpoint handling
>
> ckpt/restart.c - input wrappers and basic restart handling
>
> Patches to add the per-architecture support as well as the actual
> work to do the memory checkpoint follow in subsequent patches.
>

[...]

> diff --git a/checkpoint/sys.c b/checkpoint/sys.c
> new file mode 100644
> index 0000000..2891c48
> --- /dev/null
> +++ b/checkpoint/sys.c

[...]

> +/*
> + * helpers to manage CR contexts: allocated for each checkpoint and/or
> + * restart operation, and persists until the operation is completed.
> + */
> +
> +static atomic_t cr_ctx_count; /* unique checkpoint identifier */

I thought we agreed that this counter should be per-container. Perhaps add a
TODO here?
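
To make the suggestion concrete, here is a minimal userspace sketch of a
per-container counter. It is an illustration only, not code from the patch:
`struct container`, `container_init()` and `container_next_crid()` are
hypothetical names, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for whatever object represents a container. */
struct container {
	atomic_int cr_ctx_count;	/* checkpoint ids, unique within this container */
};

/* Start the counter at 0 so that the first id handed out is 1. */
static void container_init(struct container *c)
{
	atomic_init(&c->cr_ctx_count, 0);
}

/* Userspace analogue of atomic_inc_return(): returns the incremented value. */
static int container_next_crid(struct container *c)
{
	return atomic_fetch_add(&c->cr_ctx_count, 1) + 1;
}
```

With this layout, two containers hand out ids independently, so an id only
needs to be unique within the container it belongs to.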

> +
> +void cr_ctx_free(struct cr_ctx *ctx)
> +{
> +
> +	if (ctx->file)
> +		fput(ctx->file);
> +	if (ctx->vfsroot)
> +		path_put(ctx->vfsroot);
> +
> +	free_pages((unsigned long) ctx->tbuf, CR_TBUF_ORDER);
> +	free_pages((unsigned long) ctx->hbuf, CR_HBUF_ORDER);
> +
> +	kfree(ctx);
> +}
> +
> +struct cr_ctx *cr_ctx_alloc(pid_t pid, struct file *file, unsigned long flags)
> +{
> +	struct cr_ctx *ctx;
> +
> +	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> +	if (!ctx)
> +		return NULL;
> +
> +	ctx->tbuf = (void *) __get_free_pages(GFP_KERNEL, CR_TBUF_ORDER);
> +	ctx->hbuf = (void *) __get_free_pages(GFP_KERNEL, CR_HBUF_ORDER);
> +	if (!ctx->tbuf || !ctx->hbuf)
> +		goto nomem;
> +
> +	ctx->pid = pid;
> +	ctx->flags = flags;
> +
> +	ctx->file = file;
> +	get_file(file);
> +
> +	/* assume checkpointer is in container's root vfs */

I'm a bit puzzled by this assumption. I would say: either this is a
self-checkpoint (only the current process), or this is a container checkpoint.
In the latter case, I expect that the checkpointer generally lives
outside the container (and the interface of sys_checkpoint() below confirms
this), so its root fs is probably not the container's.

Does it differ from what you're planning?
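
As a rough userspace illustration of the distinction (not kernel code, and
not a proposal for the actual implementation): `struct fs_root`,
`struct task` and `cr_pick_vfsroot()` are hypothetical names, with `target`
standing for the container's init task when there is one.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct fs_root { const char *name; };
struct task { struct fs_root *root; };

/*
 * Pick the root against which paths in the image would be resolved.
 * Self-checkpoint (no separate target): use the caller's own root.
 * Container checkpoint: use the root of the container's init task,
 * not that of the (possibly external) checkpointer.
 */
static struct fs_root *cr_pick_vfsroot(struct task *caller, struct task *target)
{
	return target ? target->root : caller->root;
}
```

The point is only that the root is taken from the task being checkpointed
whenever the checkpointer is a different task.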

Thanks,

Louis

> +	ctx->vfsroot = &current->fs->root;
> +	path_get(ctx->vfsroot);
> +
> +	ctx->crid = atomic_inc_return(&cr_ctx_count);
> +
> +	return ctx;
> +
> + nomem:
> +	cr_ctx_free(ctx);
> +	return NULL;
> +}
> +
> +/**
> + * sys_checkpoint - checkpoint a container
> + * @pid: pid of the container init(1) process
> + * @fd: file to which dump the checkpoint image
> + * @flags: checkpoint operation flags
> + */
> +asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
> +{
> +	struct cr_ctx *ctx;
> +	struct file *file;
> +	int fput_needed;
> +	int ret;
> +
> +	file = fget_light(fd, &fput_needed);
> +	if (!file)
> +		return -EBADF;
> +
> +	/* no flags for now */
> +	if (flags)
> +		return -EINVAL;
> +
> +	ctx = cr_ctx_alloc(pid, file, flags | CR_CTX_CKPT);
> +	if (!ctx) {
> +		fput_light(file, fput_needed);
> +		return -ENOMEM;
> +	}
> +
> +	ret = do_checkpoint(ctx);
> +
> +	cr_ctx_free(ctx);
> +	fput_light(file, fput_needed);
> +
> +	return ret;
> +}
> +
> +/**
> + * sys_restart - restart a container
> + * @crid: checkpoint image identifier
> + * @fd: file from which read the checkpoint image
> + * @flags: restart operation flags
> + */
> +asmlinkage long sys_restart(int crid, int fd, unsigned long flags)
> +{
> +	struct cr_ctx *ctx;
> +	struct file *file;
> +	int fput_needed;
> +	int ret;
> +
> +	file = fget_light(fd, &fput_needed);
> +	if (!file)
> +		return -EBADF;
> +
> +	/* no flags for now */
> +	if (flags)
> +		return -EINVAL;
> +
> +	ctx = cr_ctx_alloc(crid, file, flags | CR_CTX_RSTR);
> +	if (!ctx) {
> +		fput_light(file, fput_needed);
> +		return -ENOMEM;
> +	}
> +
> +	ret = do_restart(ctx);
> +
> +	cr_ctx_free(ctx);
> +	fput_light(file, fput_needed);
> +
> +	return ret;
> +}

--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes
