Re: [PATCH 1/2] Adds a read-only "procs" file similar to "tasks" that shows only unique tgids
From: KAMEZAWA Hiroyuki
Date: Fri Jul 03 2009 - 22:07:40 EST
Paul Menage wrote:
> On Thu, Jul 2, 2009 at 10:54 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>>
>> Why can't we do what readdir(/proc) does? I'm sorry if I misunderstand.
>> The following is an easy example.
>>
>>
>> 0. At open, initialize f_pos to 0. f_pos is used as the "pid".
>>   Remember a "css_set with hole" as a template in f_private (or
>>   somewhere) at open, like this:
>> --
>>   struct cgroupfs_root *root = cgrp->root;
>>   struct cgroup_subsys_state **template =
>>        kzalloc(sizeof(void *) * CGROUP_SUBSYS_COUNT, GFP_KERNEL);
>>
>>   for (i = 0; i < CGROUP_SUBSYS_COUNT; i++)
>>        if (root->subsys_bits & (1UL << i))
>>                template[i] = cgrp->subsys[i];
>> --
>>
>>
>> 1. at read(), find task_struct of "pid" in f_pos.
>> 2. look up task_struct of "pid" and compare with f_private
>> --
>>   struct cgroup_subsys_state **template = f_private;
>>
>>   for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
>>        if (!template[i])
>>                continue;
>>        if (template[i] != task_subsys_state(task, i))
>>                break;
>>   }
>>   if (i == CGROUP_SUBSYS_COUNT)
>>        print task;
>
> The problem with this is that the time taken to scan a single cgroup
> is linear in the total number of threads in the system, so if you have
> a lot of threads and a lot of cgroups (even if most of the threads are
> concentrated in a single cgroup) the time taken to scan all the tasks
> files is O(N^2) in the number of threads in the system. The current
> scheme is linear in the number of threads in a cgroup, so looking at
> all cgroups is linear in the number of threads in the system. (This
> O(N^2) problem is something that we've actually observed as an
> overhead on some busy systems at Google).
>
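To put rough numbers on the cost described above, here is a small
user-space model, as a sketch only. The names list_by_full_scan(),
list_by_member_lists() and cgroup_of are made up for illustration and
are not kernel interfaces; the model only counts how many threads each
approach has to examine.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not kernel code: cgroup_of[t] holds the cgroup
 * index of thread t. Lists every cgroup by walking all threads each
 * time, stores the per-cgroup member counts in members[], and returns
 * the number of threads examined.
 */
static unsigned long list_by_full_scan(const int *cgroup_of, size_t nr_threads,
				       int nr_cgroups, unsigned long *members)
{
	unsigned long visits = 0;

	assert(cgroup_of && members);

	for (int cg = 0; cg < nr_cgroups; cg++) {
		members[cg] = 0;
		for (size_t t = 0; t < nr_threads; t++) {
			visits++;
			if (cgroup_of[t] == cg)
				members[cg]++;
		}
	}
	return visits;
}

/*
 * With a per-cgroup member list, listing a cgroup costs only its own
 * member count, so listing all of them costs the sum of the counts.
 */
static unsigned long list_by_member_lists(const unsigned long *members,
					  int nr_cgroups)
{
	unsigned long visits = 0;

	assert(members);

	for (int cg = 0; cg < nr_cgroups; cg++)
		visits += members[cg];
	return visits;
}
```

With 6 threads spread over 3 cgroups, the full scan examines 18 threads
in total, while the member lists examine 6. The gap grows with the
product of threads and cgroups.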
Yes, that's a problem, but it is not far from the performance of 'ps'.
Can the kmalloc() scheme walk faster than this under heavy memory pressure?
Anyway, the algorithm above shows that a per-cgroup bitmap (whose size can
be changed dynamically) is enough, rather than a big table and an ugly sort().
How about adding a per-cgroup task ID bitmap?
Setting and clearing bits is very easy.
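
As a minimal user-space sketch of such a bitmap, assuming one bit per
task ID and growth on demand via realloc(): struct taskid_bitmap and the
taskid_bitmap_*() helpers are hypothetical names, not existing kernel
code, and a kernel version would need locking and the kernel's own
allocation and bitmap helpers instead.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical per-cgroup structure: bit N is set if task ID N is a member. */
struct taskid_bitmap {
	unsigned long *words;	/* NULL while the bitmap is empty */
	size_t nwords;		/* number of allocated words */
};

/* Set the bit for id, growing the bitmap if needed. Returns 0 or -1. */
static int taskid_bitmap_set(struct taskid_bitmap *bm, unsigned int id)
{
	size_t word = id / BITS_PER_WORD;

	assert(bm);

	if (word >= bm->nwords) {
		size_t n = word + 1;
		unsigned long *p = realloc(bm->words, n * sizeof(*p));

		if (!p)
			return -1;
		/* Zero only the newly added words. */
		memset(p + bm->nwords, 0, (n - bm->nwords) * sizeof(*p));
		bm->words = p;
		bm->nwords = n;
	}
	bm->words[word] |= 1UL << (id % BITS_PER_WORD);
	return 0;
}

/* Clear the bit for id; IDs beyond the current size are already clear. */
static void taskid_bitmap_clear(struct taskid_bitmap *bm, unsigned int id)
{
	size_t word = id / BITS_PER_WORD;

	if (word < bm->nwords)
		bm->words[word] &= ~(1UL << (id % BITS_PER_WORD));
}

/* Return nonzero if the bit for id is set. */
static int taskid_bitmap_test(const struct taskid_bitmap *bm, unsigned int id)
{
	size_t word = id / BITS_PER_WORD;

	return word < bm->nwords &&
	       ((bm->words[word] >> (id % BITS_PER_WORD)) & 1UL);
}

/* Return the smallest member ID >= start, or -1 if there is none. */
static long taskid_bitmap_next(const struct taskid_bitmap *bm, unsigned int start)
{
	size_t total = bm->nwords * BITS_PER_WORD;

	for (size_t id = start; id < total; id++)
		if (taskid_bitmap_test(bm, (unsigned int)id))
			return (long)id;
	return -1;
}
```

A read() could then resume from f_pos by calling taskid_bitmap_next()
with the last reported ID plus one, so the IDs come out in sorted order
without a separate table or sort().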
Thanks,
-Kame