Re: [PATCH RFC] reduce runqueue lock contention

From: Chris Mason
Date: Thu May 20 2010 - 18:23:19 EST


On Thu, May 20, 2010 at 11:09:46PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-05-20 at 16:48 -0400, Chris Mason wrote:
> >
> > This is more of a starting point than a patch, but it is something I've
> > been meaning to look at for a long time. Many different workloads end
> > up hammering very hard on try_to_wake_up, to the point where the
> > runqueue locks dominate CPU profiles.
>
> Right, so one of the things that I considered was to make p->state an
> atomic_t and replace the initial stage of try_to_wake_up() with
> something like:
>
> int try_to_wake_up(struct task *p, unsigned int mask, wake_flags)
> {
> 	int state = atomic_read(&p->state);
>
> 	do {
> 		if (!(state & mask))
> 			return 0;
>
> 		state = atomic_cmpxchg(&p->state, state, TASK_WAKING);
> 	} while (state != TASK_WAKING);
>
> 	/* do this pending queue + ipi thing */
>
> 	return 1;
> }
>
> Also, I think we might want to put that atomic single linked list thing
> into some header (using atomic_long_t or so), because I have a similar
> thing living in kernel/perf_event.c, that needs to queue things from NMI
> context.

So I've done three of these cmpxchg lists recently...but they have all
been a little different. I went back and forth a bunch of times about
using a list_head based thing instead to avoid the walk for list append.
I really don't like the walk.

But, what makes this one unique is that I'm using a cmpxchg on the list
pointer in the task struct to take ownership of this task struct.
It is how I avoid concurrent lockless enqueues.
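
Roughly, the ownership trick looks like the userspace C11 sketch below.
This is not the code from the patch: struct task, wake_next, LIST_END and
the function names are all made up for illustration, and it pushes at the
head of the list instead of walking to append, which sidesteps the walk
but reverses the order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task struct; not the kernel layout. */
struct task {
	int id;
	/* NULL means "not queued, nobody owns this task". */
	_Atomic(struct task *) wake_next;
};

/* Non-NULL end-of-list marker, so a queued task never has a NULL link. */
static struct task list_end_marker;
#define LIST_END (&list_end_marker)

struct pending_list {
	_Atomic(struct task *) head;
};

static void pending_init(struct pending_list *l)
{
	atomic_init(&l->head, LIST_END);
}

/*
 * Claim the task by moving wake_next from NULL to non-NULL, then push it.
 * Only one concurrent caller can win the claim, so only one caller ever
 * enqueues a given task. Returns false if the task was already claimed.
 */
static bool enqueue_wakeup(struct pending_list *l, struct task *p)
{
	struct task *expected = NULL;

	if (!atomic_compare_exchange_strong(&p->wake_next, &expected, LIST_END))
		return false;

	struct task *old = atomic_load(&l->head);
	do {
		/* We own p, so a plain overwrite of its link is safe. */
		atomic_store(&p->wake_next, old);
	} while (!atomic_compare_exchange_weak(&l->head, &old, p));

	return true;
}

/* Detach the whole list, release each task, return how many there were. */
static int drain(struct pending_list *l)
{
	struct task *p = atomic_exchange(&l->head, LIST_END);
	int n = 0;

	while (p != LIST_END) {
		struct task *next = atomic_load(&p->wake_next);

		assert(next != NULL);	/* queued tasks always have a link */
		atomic_store(&p->wake_next, NULL);	/* give up ownership */
		n++;
		p = next;
	}
	return n;
}
```

A second enqueue_wakeup() on the same task fails until drain() has
released it, which is the property that keeps two CPUs from queueing the
same task at once.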

Your fiddling with the p->state above would let me avoid that.
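
For comparison, here is a rough userspace C11 sketch of claiming the task
through its state word instead. The struct, the function name and the
state values are assumptions for illustration only, and the loop here
retries until the compare-exchange itself succeeds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values only; do not assume they match any kernel version. */
enum {
	TASK_RUNNING		= 0,
	TASK_INTERRUPTIBLE	= 1,
	TASK_UNINTERRUPTIBLE	= 2,
	TASK_WAKING		= 256,
};

/* Hypothetical stand-in for the task struct. */
struct sketch_task {
	atomic_int state;
};

/*
 * Move the task to TASK_WAKING if its current state matches mask.
 * Whoever makes that transition owns the wakeup; every other waker
 * then sees TASK_WAKING, fails the mask test and backs off.
 */
static bool claim_for_wakeup(struct sketch_task *p, int mask)
{
	int state = atomic_load(&p->state);

	assert(mask != 0);
	do {
		if (!(state & mask))
			return false;
		/* On failure, state is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&p->state, &state, TASK_WAKING));

	return true;
}
```

With the claim made on the state word, the list pointer no longer has to
double as an ownership flag.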

-chris