Re: [RFC] ps command race fix

From: Eric W. Biederman
Date: Sun Aug 13 2006 - 14:58:22 EST

Andrew Morton <akpm@xxxxxxxx> writes:

> On Sun, 13 Aug 2006 10:29:51 -0600
> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) wrote:
>> So for systems that are going to be using a larger number of pid
>> values I think we need a better data structure, and containers are
>> likely to push us in that area. Which means either an extensible
>> hash table or radix tree look like the sane choices.
> radix-trees are nice because you can efficiently traverse them in-order
> while the contents are changing (albeit potentially missing newly-added
> things, but that's inevitable).

Actually except when we can't find the process we were just at
the current code doesn't miss any newly added processes. So there
are implementations that missing new entries isn't inevitable.

> radix-trees are not-nice because they allocate memory at insertion time.
> If that's a problem then rbtrees could perhaps be used.

It isn't fundamental, as I do memory allocations on that path. But
it does appear to be a implementation conflict as the current locking
is a spinlock with irqs disabled, and doing a GFP_ATOMIC allocation
to avoid that is fairly silly.

If we were to loose the potential to do rcu traversals when looking
up a single pid that would be make scaling the code much harder.

> idr-trees have similar characteristics to the radix-trees, except a) the
> idr-tree find-next-above feature could perhaps be used for the core pid
> allocation and b) idr-trees don't presently have suitable search functions
> for performing the iteration.

We have to be careful about changing the pid allocation algorithm.
We need to keep the logic where we ensure it will be a long time
before a newly freed pid is reused. Which basically means the current
implementation that walks through possible pids allocating new ones.
The current implementation doesn't give us any guarantees, but it is
much better than the normal allocator algorithm of allocating the next
available pid.

> At least we have plenty of choices ;)

Yes, and I'm still not quite convinced we have a problem that needs a
new data structure. But with everything becoming multi-core there are
serious pressures in that direction.

Anyway the important part is to get a traversal by pid.


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at