Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.

From: Ingo Molnar
Date: Mon Jan 26 2009 - 18:01:06 EST



* Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, 26 Jan 2009 23:20:02 +0100
> Ingo Molnar <mingo@xxxxxxx> wrote:
>
> >
> > * Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Mon, 26 Jan 2009 23:05:37 +0100
> > > Ingo Molnar <mingo@xxxxxxx> wrote:
> > >
> > > >
> > > > * Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > Well it turns out that I was having a less-than-usually-senile moment:
> > > > >
> > > > > : implement flush_work()
> > > >
> > > > > Why isn't that working in this case??
> > > >
> > > > how would that work in this case? We defer processing into the workqueue
> > > > exactly because we want its per-CPU properties.
> > >
> > > It detaches the work item, moves it to head-of-queue, reinserts it then
> > > waits on it. I think.
> > >
> > > This might have a race+hole. If a currently-running "unrelated" work
> > > item tries to take the lock which the flush_work() caller is holding
> > > then there's no way in which keventd will come back to execute the work
> > > item which we just put on the head of queue.
> >
> > Correct - or the unrelated worklet might also be blocked on something - so
> > the window is rather large.
> >
>
> hm, OK, that sucks.
>
> But the deadlock still exists with Rusty's patches, doesn't it? We
> still have a single kernel thread per CPU processing all the unrelated
> work_on_cpu() callers. All we've done is to decouple work_on_cpu() from
> the keventd queue.

This particular deadlock does not exist - but you are indeed right that
similar types of 'unrelated' interactions might exist in the future, as
the usage of this facility is extended.

> If correct, we'd need to create a gaggle of kernel threads on each call
> to work_on_cpu(), which doesn't sound nice.
>
> A more efficient but trickier approach would be to create kernel threads
> within flush_work(), with which to run the CPU-specific worklet. We
> only need to do that in the case where the CPU's keventd thread was off
> doing something and might deadlock, which will be rare. If the keventd
> was just parked waiting for something to do then we can safely feed it
> the to-be-flushed work item for immediate processing.

i think what you describe is a variant of the syslet thread pool ;-)

> It'd be saner to just say "don't call work_on_cpu() while holding locks"
> :( I bet there's some lockdep infrastructre which we could peek into to
> add the assertion check...

The problem isnt doing the assertions - lockdep already covers workqueue
dependencies very efficiently.

The problem is the intrinsic utility of work_on_cpu(): we _really_ want
such a generic facility to be usable from any (blockable) context, just
like on_each_cpu(func, info) does for atomic functions, without
restrictions on locking context.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/