Re: [Devel] Re: [PATCH 10/10] Add support for multiple processes

From: Andrey Mirkin
Date: Thu Oct 30 2008 - 00:54:46 EST


On Monday 27 October 2008 18:58 Oren Laadan wrote:
> Andrey Mirkin wrote:
> > The whole tree of processes can be checkpointed and restarted now.
> > Shared objects are not supported yet.
> >
> > Signed-off-by: Andrey Mirkin <major@xxxxxxxxxx>
> > ---
> > checkpoint/cpt_image.h | 2 +
> > checkpoint/cpt_process.c | 24 +++++++++++++
> > checkpoint/rst_process.c | 85
> > +++++++++++++++++++++++++++------------------- 3 files changed, 76
> > insertions(+), 35 deletions(-)
> >
> > diff --git a/checkpoint/cpt_image.h b/checkpoint/cpt_image.h
> > index e1fb483..f370df2 100644
> > --- a/checkpoint/cpt_image.h
> > +++ b/checkpoint/cpt_image.h
> > @@ -128,6 +128,8 @@ struct cpt_task_image {
> > __u64 cpt_nivcsw;
> > __u64 cpt_min_flt;
> > __u64 cpt_maj_flt;
> > + __u32 cpt_children_num;
> > + __u32 cpt_pad;
> > } __attribute__ ((aligned (8)));
> >
> > struct cpt_mm_image {
> > diff --git a/checkpoint/cpt_process.c b/checkpoint/cpt_process.c
> > index 1f7a54b..d73ec3c 100644
> > --- a/checkpoint/cpt_process.c
> > +++ b/checkpoint/cpt_process.c
> > @@ -40,6 +40,19 @@ static unsigned int encode_task_flags(unsigned int
> > task_flags)
> >
> > }
> >
> > +static int cpt_count_children(struct task_struct *tsk, struct
> > cpt_context *ctx) +{
> > + int num = 0;
> > + struct task_struct *child;
> > +
> > + list_for_each_entry(child, &tsk->children, sibling) {
> > + if (child->parent != tsk)
> > + continue;
> > + num++;
> > + }
> > + return num;
> > +}
> > +
>
> I noticed that don't take the appropriate locks when browsing through
> tasks lists (siblings, children, global list). Although I realize that
> the container should be frozen at this time, I keep wondering if this
> is indeed always safe.
You right here. We need to take tasklist_lock to be sure that everything will
be consistent.

> For instance, are you protected against an OOM killer that might just
> occur uninvited and kill one of those tasks ?

OOM killer can't kill one of those tasks as all of them should be frozen at
that time and be in uninterruptible state. So, we can just do not think about
OOM at that time. But anyway you right and we need locking around browsing
tasks list.

> Can the administrator force an un-freeze of the container ? Or perhaps
> some error condition if the kernel cause that ?

The main idea is that context should be protected with a mutex and only one
process can do some activity (freeze, dump, unfreeze, kill) with a container.
Right now it is not implemented at all, but in future it will be added.

Andrey

> > int cpt_dump_task_struct(struct task_struct *tsk, struct cpt_context
> > *ctx) {
> > struct cpt_task_image *t;
> > @@ -102,6 +115,7 @@ int cpt_dump_task_struct(struct task_struct *tsk,
> > struct cpt_context *ctx) t->cpt_egid = tsk->egid;
> > t->cpt_sgid = tsk->sgid;
> > t->cpt_fsgid = tsk->fsgid;
> > + t->cpt_children_num = cpt_count_children(tsk, ctx);
> >
> > err = ctx->write(t, sizeof(*t), ctx);
> >
> > @@ -231,6 +245,16 @@ int cpt_dump_task(struct task_struct *tsk, struct
> > cpt_context *ctx) err = cpt_dump_fpustate(tsk, ctx);
> > if (!err)
> > err = cpt_dump_registers(tsk, ctx);
> > + if (!err) {
> > + struct task_struct *child;
> > + list_for_each_entry(child, &tsk->children, sibling) {
> > + if (child->parent != tsk)
> > + continue;
> > + err = cpt_dump_task(child, ctx);
> > + if (err)
> > + break;
>
> Here too.
>
> [...]
>
> Oren.
>
> _______________________________________________
> Containers mailing list
> Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
> https://lists.linux-foundation.org/mailman/listinfo/containers
>
> _______________________________________________
> Devel mailing list
> Devel@xxxxxxxxxx
> https://openvz.org/mailman/listinfo/devel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/