Re: cgroup: rmdir() does not complete

From: KAMEZAWA Hiroyuki
Date: Thu Sep 09 2010 - 19:49:00 EST


On Fri, 10 Sep 2010 00:04:31 +0100 (BST)
Mark Hills <mark@xxxxxxxxxxx> wrote:

> On Thu, 9 Sep 2010, Peter Zijlstra wrote:
>
> > On Thu, 2010-09-09 at 12:36 +0100, Mark Hills wrote:
> >
> > > I am still finding the problem incredibly hard to reproduce, so I'd like
> > > to observe as much data as possible from the current case before
> > > rebooting. If I could capture some kind of stack trace in the kernel for
> > > the running process that would be great, any suggestions appreciated.
> >
> > echo l > /proc/sysrq-trigger
>
> Despite running this many times, I never 'catch' the process on a CPU,
> despite it using 70% in top. But...
>
> > another thing you can do is run something like: perf record -gp $pid
> > which will give you a profile of that task.
>
> This is very useful, thanks.
>
> The report on the spinning process (23586) is dominated by calls from
> mem_cgroup_force_empty.
>
> It seems to show lru_add_drain_all and drain_all_stock_sync are causing
> the load (I assume drain_all_stock_sync has been optimised out). But I
> don't think this is as important as what causes the spin.
>
> There are no tasks in the cgroup, but memory usage is non-zero and
> constant. It seems mem_cgroup_force_empty is unable to empty the cgroup in
> this case.
>
> # cat /cgroup/soaked-23586/tasks
> # cat /cgroup/soaked-23586/memory.usage_in_bytes
> 24576
> # cat /cgroup/soaked-23586/memsw.usage_in_bytes
> <hangs>
>
I think this "cat" hangs because of a lock held in the VFS.

Hmm. Then either there are pages on the LRU which cannot be moved, or
there is an accounting leak.

BTW, mem_cgroup's rmdir is designed to be interruptible by signals such
as SIGINT. Can't you stop rmdir with Ctrl-C or similar?

rmdir -> hang -> Ctrl-C (or some) -> cat .../memory.stat

Does that work? And are you still using Fedora's kernel?
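
If it is easier to script, the sequence above could look roughly like the
sketch below. This is only an illustration and has not been run against the
affected kernel. The helper names (rmdir_with_interrupt, parse_memory_stat,
nonzero_counters) are made up for this mail, and the parser assumes
memory.stat is plain "name value" lines.

```python
import signal
import subprocess


def rmdir_with_interrupt(cmd, timeout_s):
    """Run cmd; if it is still running after timeout_s seconds, send SIGINT.

    Returns True if the command had to be interrupted (the Ctrl-C case),
    False if it finished by itself within the timeout.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_s)
        return False
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGINT)
        # If rmdir does not react to the signal, this wait blocks as well,
        # which would itself be useful information.
        proc.wait()
        return True


def parse_memory_stat(text):
    """Parse "name value" lines (assumed memory.stat layout) into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def nonzero_counters(stats):
    """Keep only the counters that are still non-zero."""
    return {name: value for name, value in stats.items() if value != 0}


# Example use (path taken from the report above, not executed here):
#   cg = "/cgroup/soaked-23586"
#   if rmdir_with_interrupt(["rmdir", cg], timeout_s=10):
#       with open(cg + "/memory.stat") as f:
#           print(nonzero_counters(parse_memory_stat(f.read())))
```

The non-zero counters should show what kind of charge is still left in
the group after the tasks are gone.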

Thanks,
-Kame

> Here are the first few entries from the perf output, I can provide the
> rest if needed, but all result from mem_cgroup_force_empty.
>
> 8.13% :23586 [kernel] [k] _raw_spin_lock_irqsave
> |
> --- _raw_spin_lock_irqsave
> |
> |--45.14%-- probe_workqueue_insertion
> | insert_work
> | |
> | |--99.09%-- __queue_work
> | | queue_work_on
> | | schedule_work_on
> | | schedule_on_each_cpu
> | | |
> | | |--50.59%-- lru_add_drain_all
> | | | mem_cgroup_force_empty
> | | | mem_cgroup_pre_destroy
> | | | cgroup_rmdir
> | | | vfs_rmdir
> | | | do_rmdir
> | | | sys_rmdir
> | | | system_call_fastpath
> | | | 0x3f504d27d7
> | | | 0x405687
> | | | 0x406ef0
> | | | 0x402f31
> | | | 0x3f5041eb1d
> | | |
> | | --49.41%-- mem_cgroup_force_empty
> | | mem_cgroup_pre_destroy
> | | cgroup_rmdir
> | | vfs_rmdir
> | | do_rmdir
> | | sys_rmdir
> | | system_call_fastpath
> | | 0x3f504d27d7
> | | 0x405687
> | | 0x406ef0
> | | 0x402f31
> | | 0x3f5041eb1d
> | --0.91%-- [...]
> |
> |--22.92%-- mem_cgroup_force_empty
> | mem_cgroup_pre_destroy
> | cgroup_rmdir
> | vfs_rmdir
> | do_rmdir
> | sys_rmdir
> | system_call_fastpath
> | 0x3f504d27d7
> | 0x405687
> | 0x406ef0
> | 0x402f31
> | 0x3f5041eb1d
> |
> |--8.17%-- __queue_work
> | queue_work_on
> | schedule_work_on
> | schedule_on_each_cpu
> | |
> | |--52.09%-- lru_add_drain_all
> | | mem_cgroup_force_empty
> | | mem_cgroup_pre_destroy
> | | cgroup_rmdir
> | | vfs_rmdir
> | | do_rmdir
> | | sys_rmdir
> | | system_call_fastpath
> | | 0x3f504d27d7
> | | 0x405687
> | | 0x406ef0
> | | 0x402f31
> | | 0x3f5041eb1d
> | |
> | --47.91%-- mem_cgroup_force_empty
> | mem_cgroup_pre_destroy
> | cgroup_rmdir
> | vfs_rmdir
> | do_rmdir
> | sys_rmdir
> | system_call_fastpath
> | 0x3f504d27d7
> | 0x405687
> | 0x406ef0
> | 0x402f31
> | 0x3f5041eb1d
> |
> |--7.94%-- __wake_up
> | |
> | |--99.71%-- insert_work
> | | |
> | | |--97.70%-- __queue_work
> | | | queue_work_on
> | | | schedule_work_on
> | | | schedule_on_each_cpu
> | | | |
> | | | |--50.59%-- mem_cgroup_force_empty
> | | | | mem_cgroup_pre_destroy
> | | | | cgroup_rmdir
> | | | | vfs_rmdir
> | | | | do_rmdir
> | | | | sys_rmdir
> | | | | system_call_fastpath
> | | | | 0x3f504d27d7
> | | | | 0x405687
> | | | | 0x406ef0
> | | | | 0x402f31
> | | | | 0x3f5041eb1d
> | | | |
> | | | --49.41%-- lru_add_drain_all
> | | | mem_cgroup_force_empty
> | | | mem_cgroup_pre_destroy
> | | | cgroup_rmdir
> | | | vfs_rmdir
> | | | do_rmdir
> | | | sys_rmdir
> | | | system_call_fastpath
> | | | 0x3f504d27d7
> | | | 0x405687
> | | | 0x406ef0
> | | | 0x402f31
> | | | 0x3f5041eb1d
> | | --2.30%-- [...]
> | --0.29%-- [...]
> |
> |--4.35%-- mem_cgroup_pre_destroy
> | cgroup_rmdir
> | vfs_rmdir
> | do_rmdir
> | sys_rmdir
> | system_call_fastpath
> | 0x3f504d27d7
> | 0x405687
> | 0x406ef0
> | 0x402f31
> | 0x3f5041eb1d
> --11.47%-- [...]
>
> 7.25% :23586 [kernel] [k] sched_clock_cpu
> |
> --- sched_clock_cpu
> |
> |--97.11%-- update_rq_clock
> | |
> | |--98.89%-- try_to_wake_up
> | | default_wake_function
> | | autoremove_wake_function
> | | __wake_up_common
> | | __wake_up
> | | insert_work
> | | __queue_work
> | | queue_work_on
> | | schedule_work_on
> | | schedule_on_each_cpu
> | | |
> | | |--50.69%-- lru_add_drain_all
> | | | mem_cgroup_force_empty
> | | | mem_cgroup_pre_destroy
> | | | cgroup_rmdir
> | | | vfs_rmdir
> | | | do_rmdir
> | | | sys_rmdir
> | | | system_call_fastpath
> | | | 0x3f504d27d7
> | | | 0x405687
> | | | 0x406ef0
> | | | 0x402f31
> | | | 0x3f5041eb1d
> | | |
> | | --49.31%-- mem_cgroup_force_empty
> | | mem_cgroup_pre_destroy
> | | cgroup_rmdir
> | | vfs_rmdir
> | | do_rmdir
> | | sys_rmdir
> | | system_call_fastpath
> | | 0x3f504d27d7
> | | 0x405687
> | | 0x406ef0
> | | 0x402f31
> | | 0x3f5041eb1d
> | --1.11%-- [...]
> --2.89%-- [...]
>
> 5.54% :23586 [kernel] [k] try_to_wake_up
> |
> --- try_to_wake_up
> |
> |--99.13%-- default_wake_function
> | autoremove_wake_function
> | __wake_up_common
> | __wake_up
> | insert_work
> | __queue_work
> | queue_work_on
> | schedule_work_on
> | schedule_on_each_cpu
> | |
> | |--52.03%-- lru_add_drain_all
> | | mem_cgroup_force_empty
> | | mem_cgroup_pre_destroy
> | | cgroup_rmdir
> | | vfs_rmdir
> | | do_rmdir
> | | sys_rmdir
> | | system_call_fastpath
> | | 0x3f504d27d7
> | | 0x405687
> | | 0x406ef0
> | | 0x402f31
> | | 0x3f5041eb1d
> | |
> | --47.97%-- mem_cgroup_force_empty
> | mem_cgroup_pre_destroy
> | cgroup_rmdir
> | vfs_rmdir
> | do_rmdir
> | sys_rmdir
> | system_call_fastpath
> | 0x3f504d27d7
> | 0x405687
> | 0x406ef0
> | 0x402f31
> | 0x3f5041eb1d
> --0.87%-- [...]
>
> --
> Mark
>
