Re: cgroup: rmdir() does not complete

From: Mark Hills
Date: Mon Aug 30 2010 - 05:13:24 EST


On Fri, 27 Aug 2010, KAMEZAWA Hiroyuki wrote:

> On Fri, 27 Aug 2010 12:39:48 +0900
> Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx> wrote:
>
> > On Fri, 27 Aug 2010 11:35:06 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> >
> > > On Fri, 27 Aug 2010 09:56:39 +0900
> > > Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > > Or is it likely to be some other cause, and how best to find it?
> > > > >
> > > > What cgroup subsystem did you mount where the directory existed you tried
> > > > to rmdir() first ?
> > > > If you mounted several subsystems on the same hierarchy, can you mount them
> > > > separately to narrow down the cause ?
> > > >
> > >
> > > It seems I can reproduce the issue on mmotm-0811, too.
> > >
> > > try this.
> > >
> > > Here, memory cgroup is mounted at /cgroups.
> > > ==
> > > #!/bin/bash -x
> > >
> > > while sleep 1; do
> > > 	date
> > > 	mkdir /cgroups/test
> > > 	echo 0 > /cgroups/test/tasks
> > > 	echo 300M > /cgroups/test/memory.limit_in_bytes
> > > 	cat /proc/self/cgroup
> > > 	dd if=/dev/zero of=./tmpfile bs=4096 count=100000
> > > 	echo 0 > /cgroups/tasks
> > > 	cat /proc/self/cgroup
> > > 	rmdir /cgroups/test
> > > 	rm ./tmpfile
> > > done
> > > ==
> > >
> > > hangs at rmdir. I'm now investigating force_empty.
> > >
> > Thank you very much for your information.
> >
> > Some questions.
> >
> > Is "tmpfile" created on a normal filesystem(e.g. ext3) or tmpfs ?
> on ext4.
>
> > And, how long is it likely to take to cause this problem ?
>
> very soon. 10-20 loop.

The test case I was running is similar to the above. With the Lustre
filesystem the problem takes 4 hours or more to show itself. Recently I
ran 4 threads for over 24 hours without it being seen -- I suspect some
external factor is involved.

I also tried NFS, and did not see a problem after 8 hours or so, but this
is inconclusive.

The use of the Fedora kernel and the Lustre filesystem is not
satisfactory for tracing the bug. Until I can get a test case which is
more readily reproducible, I'm not able to reasonably think about
changing variables.

It is interesting that you see the problem so readily on ext4; I will test
that soon (it is currently a holiday weekend in the UK). I hope it will
give me the test case I am looking for.
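
One way to make a long-running loop like the one quoted above report a hang
rather than stall silently is to put a time limit on the rmdir step. This is
only a sketch: the function name rmdir_or_report and the 60 second default
are hypothetical, and it assumes the blocked rmdir can still be interrupted
by a signal (if it cannot, timeout itself will wait as well).

```shell
#!/bin/bash
# Hypothetical helper: run rmdir under a time limit so that a hang is
# reported instead of blocking the test loop indefinitely.
# Usage: rmdir_or_report DIR [SECONDS]
rmdir_or_report() {
    local dir=$1
    local limit=${2:-60}    # assumed default, not taken from the thread
    local rc

    # GNU coreutils timeout exits with status 124 when the limit expires.
    timeout "$limit" rmdir "$dir"
    rc=$?

    if [ "$rc" -eq 124 ]; then
        echo "rmdir $dir still blocked after ${limit}s" >&2
    fi
    return "$rc"
}
```

In the quoted script, "rmdir /cgroups/test" could then become
"rmdir_or_report /cgroups/test || break", so that the loop stops and the
cgroup is left in place for inspection.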

Thanks

--
Mark