Re: CPU POSIX timers livelock

From: Frank Mayhar
Date: Fri May 02 2008 - 13:11:30 EST


On Fri, 2008-05-02 at 18:13 +0200, Petr Tesarik wrote:
> On Fri, 2008-05-02 at 17:21 +0200, Björn Steinbrink wrote:
> > [Added Roland McGrath and Frank Mayhar to Cc:, as this sounds similar
> > enough to what has been discussed here http://lkml.org/lkml/2008/2/6/505]
>
> Yes, I've just now found the thread too, read it, and I think this is
> just another case where the current implementation does not scale.
>
> Was there any followup to the patch posted on the 7th of March? The
> interesting discussion seems to be interrupted there. :(

Roland and I have continued the conversation but we took it off the LKML
since it was really getting into the nitty-gritty details of the
implementation and wasn't that interesting to someone not actually
directly involved.

The upshot is that I have a proposed patch that I have handed to Roland
to review. He's pretty busy, though, so he may not have gotten to it
yet. Perhaps this thread will give him further incentive. :-)

Petr's analysis pretty much matches mine, except that he went into a bit
more detail in actually computing numbers and whatnot whereas I just
reasoned that with a sufficiently large number of threads pretty much
any process that uses POSIX timers can cause the system to livelock,
simply because repeatedly running the thread group list in
run_posix_cpu_timers() will at some point take as long as the timer tick
itself.

My proposed patch that Roland is reviewing corrects the implementation
of run_posix_cpu_timers() to make it run in constant time for a
particular machine by defining a couple of new structures and keeping
the thread group timers in one of these. It's way more complex than
this and I have a README that goes into detail if anyone is interested.

I've tested the patch with as many as 200,000 threads (all of which are
running a prime number sieve and are therefore CPU bound) and it appears
to work fine. Before I post it again, though, I want Roland to sign off
on it.
--
Frank Mayhar <fmayhar@xxxxxxxxxx>
Google, Inc.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/