Re: [PATCH 1/2] Adds a read-only "procs" file similar to "tasks"that shows only unique tgids

From: KAMEZAWA Hiroyuki
Date: Fri Jul 03 2009 - 22:07:40 EST


Paul Menage さんは書きました:
> On Thu, Jul 2, 2009 at 10:54 PM, KAMEZAWA
> Hiroyuki<kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>>
>> Why we can't do what readdir(/proc) does ? I'm sorry I misunderstand.
>> Following is an easy example.
>>
>>
>> 0. at open, inilialize f_pos to 0. f_pos is used as "pid"
>> &#160; remember "css_set with hole" as template in f_private?(or
somewhere) at open
>> &#160; ...like this.
>> --
>> &#160; struct cgroupfs_root *root = cgrp->root;
>> &#160; struct cgroup *template = kzalloc(sizeof(void*) *
CGROUP_SUBSYS_COUNT);
>>
>> &#160; for (i = 0; i < CGROUP_SUBSYS_COUNT; i++)
>> &#160; &#160; &#160; &#160;if (root->subsys_bits & (1UL << i))
>> &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;template[i] =
&#160;cgrp->subsys[i];
>> --
>>
>>
>> 1. at read(), find task_struct of "pid" in f_pos.
>> 2. look up task_struct of "pid" and compare with f_private
>> --
>> &#160; struct cgroup *template = f_private;
>>
>> &#160; for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
>> &#160; &#160; &#160; &#160;if (!template[i])
>> &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;contiue;
>> &#160; &#160; &#160; &#160;if (template[i] != task_subsys_state(task, i))
>> &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;break;
>> &#160; }
>> &#160; if (i == CGROUP_SUBSYS_COUNT)
>> &#160; &#160; &#160; &#160;print task;
>
> The problem with this is that the time taken to scan a single cgroup
> is linear in the total number of threads in the system, so if you have
> a lot of threads and a lot of cgroups (even if most of the threads are
> concentrated in a single cgroup) the time taken to scan all the tasks
> files in O(N^2) in the number of threads in the system. The current
> scheme is linear in the number of threads in a cgroup, so looking at
> all cgroups is linear in the number of threads in the system. (This
> O(N^2) problem is something that we've actually observed as an
> overhead on some busy systems at Google).
>
yes. that's a problem. but not far from 'ps' 's performance.
kmalloc() scheme can walk faster than this under heavy memory pressure ?
Anyway, above algorithm shows that it's enough to have per-cgroup bitmap
(size can be dinamically changed) rather than big table and ugly sort().
How about adding per-cgroup taskid bitmap ?
clear/set is very easy.

Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/