Re: [PATCH 1/2] Adds a read-only "procs" file similar to "tasks" that shows only unique tgids

From: Eric W. Biederman
Date: Wed Jul 15 2009 - 04:33:17 EST


Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes:

> On Fri, 3 Jul 2009 10:54:48 -0700 Paul Menage <menage@xxxxxxxxxx> wrote:
>
>> > Unfortunately radix-trees are presented as operating on
>> > void* data, so one would need to do some typecasting when storing
>> > BITS_PER_LONG-sized bitfields inside them.
>>
>> That would mean adding something a bit like the IDA wrapper that
>> converts IDR to deal with bitfields?
>
> I guess so.
>
>> Is the benefit of avoiding a vmalloc() at all costs really worth the
>> additional complexity?
>
> Well no. But nor was it worth the additional complexity the last twenty
> times someone resorted to vmalloc to solve a problem of this nature. Taking
> a kernel-wide perspective here gives a different answer.
>
> However I don't think a little scoreboarding thing (what's the correct
> term) built around radix-trees would suffice to solve many of those
> past sins. Whereas a general dynamic array thing would be applicable
> in many cases.

It is even easier. Just grab the logic from proc_pid_readdir.
It uses rcu locking.
It returns pids in order.
It needs no mallocs to use.

The people who ran benchmarks tell me I actually sped up proc
when I started traversing the existing bitmap of pids.

It guarantees that every process that exists for the whole
length of the operation will have its pid reported. Processes
that die or are born halfway through get no such guarantee,
but if they stay around you will get them next time.

I think guaranteeing a truly atomic snapshot is likely to be a
horrible idea, requiring all kinds of nasty locking and causing
SMP scalability issues. So please walk the list of pids and
just return those that belong to your cgroup.
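
As a rough illustration of that walk, here is a user-space sketch, not
the kernel code. Everything in it is hypothetical and made up for the
example: struct pid_table, PID_MAX, pid_table_add(), next_pid() and
read_group_pids(), with a plain int array standing in for cgroup
membership. It leaves out RCU and all other locking. It only shows the
shape of the idea: scan the bitmap in pid order, keep the pids that
match, and carry a cursor so a later read resumes where the previous
one stopped, with no allocation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical limits and layout, chosen small for the sketch. */
#define PID_MAX 4096
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS ((PID_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct pid_table {
    unsigned long map[MAP_WORDS];   /* bit n set => pid n is in use */
    int group[PID_MAX];             /* stand-in for cgroup membership */
};

/* Mark a pid as in use and record which group it belongs to. */
static void pid_table_add(struct pid_table *t, int pid, int group)
{
    assert(pid >= 0 && pid < PID_MAX);
    t->map[pid / BITS_PER_WORD] |= 1UL << (pid % BITS_PER_WORD);
    t->group[pid] = group;
}

/* Return the smallest in-use pid >= start, or -1 if there is none. */
static int next_pid(const struct pid_table *t, int start)
{
    for (int pid = start < 0 ? 0 : start; pid < PID_MAX; pid++) {
        if (t->map[pid / BITS_PER_WORD] & (1UL << (pid % BITS_PER_WORD)))
            return pid;
    }
    return -1;
}

/*
 * Copy up to max pids of the given group into out, in ascending order,
 * starting the scan at *cursor. On return *cursor is where the next
 * call should resume (PID_MAX once the whole bitmap has been scanned).
 */
static size_t read_group_pids(const struct pid_table *t, int group,
                              int *cursor, int *out, size_t max)
{
    size_t n = 0;
    int pid = *cursor;

    while (n < max && (pid = next_pid(t, pid)) >= 0) {
        if (t->group[pid] == group)
            out[n++] = pid;
        pid++;                      /* continue after the pid just seen */
    }
    *cursor = pid < 0 ? PID_MAX : pid;
    return n;
}
```

A reader would start with the cursor at 0 and call read_group_pids()
repeatedly until it returns 0. A pid added behind the cursor between
calls is simply missed until the next full pass, which is the same
non-atomic behaviour described above.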

Compared to the rest of the implementations everyone is balking
at, it should be simple.

Eric