Re: /proc/sys/kernel/pid_max issues

From: Ingo Molnar
Date: Sun Sep 12 2004 - 04:43:55 EST



* Anton Blanchard <anton@xxxxxxxxx> wrote:

> I tried creating 100,000 threads just for the hell of it. I was
> surprised that it appears to have worked even with pid_max set at 32k.
>
> It seems if we are above pid_max we wrap back to RESERVED_PIDS at the
> start of alloc_pidmap but do not enforce this upper limit. I guess
> every call of alloc_pidmap above 32k was wrapping back to
> RESERVED_PIDS, walking the allocated space then allocating off the
> end.

yeah. Does the attached patch fix it?
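
For illustration only, here is a user-space toy model of the behaviour described above. It is a sketch, not the kernel code: the names toy_pidmap, toy_alloc and the TOY_* constants are made up, the bitmap is a plain bool array, and the limit is checked before the slot is taken (the patch instead clears the already-set bit). It shows how wrapping only at the start of the allocator lets the scan run past the limit, and how a second check at the point of return stops that.

```c
#include <stdbool.h>

#define TOY_MAP_BITS 16   /* size of the toy bitmap (hypothetical) */
#define TOY_RESERVED 2    /* stand-in for RESERVED_PIDS */
#define TOY_PID_MAX  8    /* stand-in for pid_max */

struct toy_pidmap {
	bool used[TOY_MAP_BITS];
	int last_pid;
};

static void toy_init(struct toy_pidmap *m)
{
	for (int i = 0; i < TOY_MAP_BITS; i++)
		m->used[i] = false;
	m->last_pid = TOY_RESERVED - 1;
}

/* Returns the allocated pid, or -1 if nothing could be allocated. */
static int toy_alloc(struct toy_pidmap *m, bool enforce_limit)
{
	int pid = m->last_pid + 1;

	/* The only limit check in the unpatched model: wrap at the start. */
	if (pid >= TOY_PID_MAX)
		pid = TOY_RESERVED;

	/* Scan forward for a free slot; bounded by the map size only. */
	while (pid < TOY_MAP_BITS && m->used[pid])
		pid++;

	if (pid >= TOY_MAP_BITS)
		return -1;

	/* The extra check modelled on the patch: refuse pids past the limit. */
	if (enforce_limit && pid >= TOY_PID_MAX)
		return -1;

	m->used[pid] = true;
	m->last_pid = pid;
	return pid;
}

/* Allocate until failure and report how many pids were handed out. */
static int toy_count_allocs(bool enforce_limit)
{
	struct toy_pidmap m;
	int n = 0;

	toy_init(&m);
	while (toy_alloc(&m, enforce_limit) >= 0)
		n++;
	return n;
}
```

With these toy constants, toy_count_allocs(false) hands out pids 2..15 (14 of them, well past the limit of 8), while toy_count_allocs(true) stops after pids 2..7.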

> Just as an aside, does it make sense to remove the pidmap allocator
> and use the IDR allocator now its there?

might make sense - needs benchmarking. In particular the performance of
kill(pid, 0) [PID lookup] should be benchmarked on the cycle level, and
the combined performance of pthread_create()+pthread_exit().
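
A rough starting point for the kill(pid, 0) half of such a benchmark could look like the sketch below. It is an assumption-laden illustration: bench_kill0_ns is a made-up helper, and it measures average wall-clock nanoseconds with clock_gettime(CLOCK_MONOTONIC) rather than cycles, so a real cycle-level comparison would need a cycle counter and more care about warm-up and noise.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <time.h>
#include <unistd.h>

/*
 * Average wall-clock nanoseconds per kill(pid, 0) call,
 * or -1.0 if a call fails. Signal 0 only performs the PID
 * lookup and permission check; no signal is delivered.
 */
static double bench_kill0_ns(pid_t pid, long iterations)
{
	struct timespec start, end;

	if (iterations <= 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1.0;

	for (long i = 0; i < iterations; i++) {
		if (kill(pid, 0) != 0)
			return -1.0;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1.0;

	double ns = (double)(end.tv_sec - start.tv_sec) * 1e9
		  + (double)(end.tv_nsec - start.tv_nsec);
	return ns / (double)iterations;
}
```

Calling bench_kill0_ns(getpid(), 1000000) on kernels with the two allocators would give a first, coarse comparison of the lookup cost.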

> Now once I had managed to allocate those 100,000 threads, I noticed
> this:
>
> 18446744071725383682 dr-xr-xr-x 3 root root 0 Sep 12 08:10 100796
>
> Strange huh. Turns out we allocate inodes in proc via:
>
> #define fake_ino(pid,ino) (((pid)<<16)|(ino))
>
> With 32bit inodes we are screwed once pids go over 64k arent we?

indeed.
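
The arithmetic can be checked in user space with the sketch below. The helper names are made up, and two things are assumptions: that the low inode bits for the directory in the listing are 2 (inferred from the last digits of the number shown), and that the 32-bit result passes through a signed int before being widened to a 64-bit unsigned long. Under those assumptions the odd inode number in the listing falls out exactly.

```c
#include <stdint.h>

/*
 * Model of the fake_ino() macro quoted above, done in explicit
 * unsigned 32-bit arithmetic so the overflow is well defined here.
 */
static uint32_t fake_ino32(uint32_t pid, uint32_t ino)
{
	return (pid << 16) | ino;
}

/*
 * Assumption: the value goes through a signed 32-bit int and is
 * then widened to 64 bits, so the sign bit is extended. The
 * uint32_t -> int32_t conversion is implementation-defined in C,
 * but wraps on the usual two's-complement compilers.
 */
static uint64_t fake_ino_widened(uint32_t pid, uint32_t ino)
{
	return (uint64_t)(int64_t)(int32_t)fake_ino32(pid, ino);
}
```

fake_ino_widened(100796, 2) gives 18446744071725383682, the value in the listing, and fake_ino32(65541, 2) equals fake_ino32(5, 2): with only 16 bits left for the pid, pid 65541 aliases pid 5.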

i'm wondering, don't we have a similar problem with PROC_TID_FD_DIR
already? Running a simple program that opens 1 million files gives:

[root@saturn root]# ulimit -n 1000000
[root@saturn root]# ./open-fds 1000000
999997 fds opened
[root@saturn root]# cd /proc/2333/fd/
[root@saturn fd]# ls -li | grep 153028253
153028253 lrwx------ 1 root root 64 Sep 12 11:18 165533 -> /dev/pts/0
153028253 lrwx------ 1 root root 64 Sep 12 11:18 362141 -> /dev/pts/0
153028253 lrwx------ 1 root root 64 Sep 12 11:18 427677 -> /dev/pts/0
153028253 lrwx------ 1 root root 64 Sep 12 11:18 624285 -> /dev/pts/0
153028253 lrwx------ 1 root root 64 Sep 12 11:19 689821 -> /dev/pts/0
153028253 lrwx------ 1 root root 64 Sep 12 11:18 99997 -> /dev/pts/0
[...]

plenty of overlap in the #ino space: the six fds shown above all share
inode number 153028253.
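
The collisions can be reproduced with the sketch below. It assumes the per-fd inode is PROC_TID_FD_DIR + fd with PROC_TID_FD_DIR equal to 0x8000 (an assumption, but one that matches every line of the listing) and models fake_ino() in unsigned 32-bit arithmetic; fd_fake_ino and TOY_PROC_TID_FD_DIR are made-up names.

```c
#include <stdint.h>

/* Assumed base of the per-fd inode range; it reproduces the listing. */
#define TOY_PROC_TID_FD_DIR 0x8000u

/*
 * Once 0x8000 + fd needs more than 16 bits, its upper bits are
 * OR-ed into the bits that hold the pid, so different fds can
 * end up with the same inode number.
 */
static uint32_t fd_fake_ino(uint32_t pid, uint32_t fd)
{
	uint32_t ino = TOY_PROC_TID_FD_DIR + fd;

	return (pid << 16) | ino;
}
```

For pid 2333, the fds 99997, 165533, 362141, 427677, 624285 and 689821 all map to 153028253: their inode values share the same low 16 bits, and the upper bits they spill over either are already set in the pid or add the same single bit to it.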

Ingo

--- linux/kernel/pid.c.orig
+++ linux/kernel/pid.c
@@ -103,7 +103,7 @@ int alloc_pidmap(void)
 	pidmap_t *map;
 
 	pid = last_pid + 1;
-	if (pid >= pid_max)
+	if (unlikely(pid >= pid_max))
 		pid = RESERVED_PIDS;
 
 	offset = pid & BITS_PER_PAGE_MASK;
@@ -116,6 +116,10 @@ int alloc_pidmap(void)
 	 * slowpath and that fixes things up.
 	 */
 return_pid:
+	if (unlikely(pid >= pid_max)) {
+		clear_bit(offset, map->page);
+		goto failure;
+	}
 	atomic_dec(&map->nr_free);
 	last_pid = pid;
 	return pid;