Re: [Linux] Linux PID algorithm is BRAINDEAD!

From: yalin wang
Date: Fri Oct 09 2015 - 23:20:51 EST

Next message: Tejun Heo: "[PATCHSET cgroup/for-4.4] cgroup: make zombies retain cgroup membership and fix pids controller"
Previous message: Krzysztof Kozlowski: "Re: [RFT 0/3] usb: usb3503: Fix probing on Arndale board (missing phy)"
In reply to: Dave Goel: "[Linux] Linux PID algorithm is BRAINDEAD!"
Next in thread: Theodore Ts'o: "Re: [Linux] Linux PID algorithm is BRAINDEAD!"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> On Oct 10, 2015, at 10:00, Dave Goel <deego3@xxxxxxxxx> wrote:
>
> Pardon the subject line! I think the PID algo. is actually pretty
> good and cheap.
>
>
> I just think that a very minor tweak could actually make it *actually* do
> what it always intended to do (that is, satisfy the PID-AIM listed below)!
> No expanded PID renumbering, no incompatibility introduction, nothing, just
> a relatively *minor tweak*.
>
>
> *** QUICK SUMMARY:
>
>
> PROBLEM: PID gets re-used immediately as soon as it is freed.
>
> EXAMPLE: My program with PID 1323 uses temporary files that are of the form
> PID-<something>. It finishes in two days By this time, linux has circled
> PIDs back around five times. And, the counter happens to be right at 1323
> right as the program finishes. As soon as the PID is freed, Linux thus
> re-uses the PID 1323. A trash-cleaner program (I know, I know) gets
> confused. 1323 exited uncleanly. The cleaner sees no 1323 running. It then
> proceeds to delete the temp files. But, by that time, the kernel started
> another 1323 which happens to use the very files. The cleaner ends up
> deleting files for the wrong, running program!
>
> This is just one example. There are many more examples of race conditions.
>
>
>
> A TINY TWEAK AS SOLUTION:
>
> The kernel already tries to do the right thing. The only minor tweak needed
> is that:
>
> ***When Linux Re-Uses A Pid, It Uses The Pid That Has Been Free For
> The Longest.***
>
> That's it!
>
>
> RESULT:
>
> All PID-related race conditions are eliminated, period. No "good enough"
> hacks needed any more by utilities trying to avoid race conditions. A
> freshly freed PID will never get re-used immediately any more . No more race
> conditions. (The only time you will ever see an immediate re-use any more is
> when your machine actually has 32767(!) or (2^22-1) processes! And, by
> that time, you have FAR bigger problems.)
>
>
> ----
>
>
> **** DETAILS:
>
>
> You don't have to google very much to see the 1000 algos and other bashisms
> that exist to avoid the very race condition. For example, when you want to
> read a PID list and deleting temporary files based on a PID. The concern
> is that Linux might have created a new process with the same PID by the
> time you read the file-list. We could argue endlessly that these bashisms
> are stupid and there are better ways. But, it seems to me that these better
> ways are not foolproof either; they are themselves hacks. And, a very simple
> tweak could alleviate 100% of these problems.
>
> Now, 32768, or even 2^22 are actually very small numbers. OTOH, 2^200 is
> not. In an ideal world, the PID would just sample from the 2^200 space and
> declare that there will never be PID re-use or conflicts, period. If there's
> a 32-bit limit, it could use multiple bytes and still use a 2^200 space,
> etc. But, all that would be a drastic change, which is why we are stuck
> with the 2^15 space.
>
> Is there a way to continue using the 2^15 (or 2^22) space, but more
> reasonably?
>
> I argue that the kernel already tries to satisfy a "PID-AIM" and already
> tries to Do The Right Thing, but there's just a tiny thinko in its current
> implementation that's easily corrected.
>
>
> PID-AIM:
> "No immediate re-use." The aim is to NOT immediately re-use a PID right
> after it's been freed. I argue that this aim can easily be satisfied, even
> within the limited 2^15 space.
>
>
> CURRENT IMPLEMENTATION:
> The creators had the right idea. Linux indeed tries to satisfy the PID-AIM
> condition. This is why it increments the counter even if the PID is actually
> available.
>
> But, looping happens (within a few hours for me). And, as soon as looping
> happens, satisfying the PID-AIM goes out the window.
>
> This tweak would ensure that immediate re-use never happens, period.
>
> COMPLEXITY, ETC:
>
> All that the entire system needs is one queue of free PIDs. Any time you
> need a PID, take it from the head. Any time a PID is newly freed, push it at
> the back of the queue. That's it! The overhead seems minimal to me.
>
> The queue is initially populated by 2-32768, of course.
>
> In fact, we could even use a smaller queue and not even activate the
> queue-mode till it is actually necessary; we could use various optimizing
> conditions and shortcuts, say, to push PIDs in the queue in batches. After
> all, it's not STRICTLY necessary to maintain exactly the correct order. The
> ONLY aim we really want to satisfy is that the latest-freed PID NOT go near
> the *head* of the queue; that's all. Also, the queue is only even needed in
> the first place until you've actually looped around. So, tiny rescue disk
> bootups would not even need to go the queue-mode.. (unless they've been
> running many days.)
>
>
> (Thanks to jd_1 and _mjg on for humoring and discussing this idea when I
> presented it on ##kernel.)
>
>
> Dave Goel (Please CC responses.)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
i see there is
pid_ns->last_pid,
it will allocate new pid after last_pid,
this means it will allocate large number pid unless pid wrap ,

the same result as you describe queue method .
i think .

Thanks

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Tejun Heo: "[PATCHSET cgroup/for-4.4] cgroup: make zombies retain cgroup membership and fix pids controller"
Previous message: Krzysztof Kozlowski: "Re: [RFT 0/3] usb: usb3503: Fix probing on Arndale board (missing phy)"
In reply to: Dave Goel: "[Linux] Linux PID algorithm is BRAINDEAD!"
Next in thread: Theodore Ts'o: "Re: [Linux] Linux PID algorithm is BRAINDEAD!"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]