Re: [RFC][PATCH] irq_work

From: Ingo Molnar
Date: Thu Jun 24 2010 - 08:35:51 EST

* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Thu, 2010-06-24 at 13:23 +0200, Ingo Molnar wrote:
>
> > What might make sense is to offer two types of callbacks: one that is
> > immediate whenever an event triggers - and another that is sleepable and is
> > executed from process context.
>
> Trouble is waking that thread, you cannot wake tasks from NMI context,
> so whatever you do, you'll end up with a trampoline.
>
> You could of course offer that trampoline nicely packaged, but I'm not
> sure that's really worth the effort.

Right, so there are basically three clean solutions to the 'sleepable
callback' problem, in order of the amount of state that needs to be passed
to it:

- State-less (or idempotent) events/callbacks: use a hardirq callback to wake
up a well-known process context.

- If we want the task that generates an event to execute a sleeping callback:
use a TIF flag and state in the task itself to pass along the info.

- In the most generic case, if there's an arbitrary target task and arbitrary
state that needs to be queued, then to achieve sleepable callbacks the
following solution can be used: the task allocates a perf ring-buffer and
uses a TIF flag to trigger consumption of it.

All memory allocation, wakeup, etc. is handled already by the regular perf
events and ring-buffer codepaths.

No special, open-coded trampolining needed - the ring-buffer is the trampoline
and the ring-buffer consumer can key off the events it receives. (and there
can be multiple consumers of the same event source so we can have in-task
kernel-based action combined with a user-space daemon that gets an event
stream as well.)
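
As a rough illustration of the "ring-buffer is the trampoline" idea, here is
a small user-space sketch in C11. It is not the perf ring-buffer code: the
names struct ring, ring_post(), ring_drain() and sum_cb(), the slot count and
the event layout are all hypothetical, and the 'pending' flag merely stands
in for a TIF flag. It assumes a single producer and a single consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8u                    /* hypothetical size */
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0,
              "RING_SLOTS must be a power of two");

struct event { int type; long data; };   /* hypothetical payload */

struct ring {
    struct event slot[RING_SLOTS];
    atomic_uint  head;      /* next slot the producer writes */
    atomic_uint  tail;      /* next slot the consumer reads */
    atomic_bool  pending;   /* stand-in for the TIF flag */
};

static void ring_init(struct ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
    atomic_init(&r->pending, false);
}

/*
 * Producer: models the restricted (NMI-like) context. It never blocks,
 * allocates or wakes a task; it only fills a slot and sets the flag.
 * Returns false if the ring is full and the event was dropped.
 */
static bool ring_post(struct ring *r, struct event ev)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return false;

    r->slot[head % RING_SLOTS] = ev;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    atomic_store_explicit(&r->pending, true, memory_order_release);
    return true;
}

/*
 * Consumer: models the task noticing the flag at a safe point. The
 * callback runs in ordinary context here, so it would be free to sleep.
 * Returns the number of events handed to the callback.
 */
static size_t ring_drain(struct ring *r,
                         void (*cb)(const struct event *, void *), void *arg)
{
    size_t n = 0;

    if (!atomic_exchange_explicit(&r->pending, false, memory_order_acq_rel))
        return 0;

    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);

    while (tail != head) {
        /* Copy out first so the producer may reuse the slot. */
        struct event ev = r->slot[tail % RING_SLOTS];

        tail++;
        atomic_store_explicit(&r->tail, tail, memory_order_release);
        cb(&ev, arg);
        n++;
    }
    return n;
}

/* Example callback (hypothetical): accumulate the event payloads. */
static void sum_cb(const struct event *ev, void *arg)
{
    *(long *)arg += ev->data;
}
```

The producer side does nothing but store to memory, which is the property
that matters for a context that cannot wake tasks. All the work that may
sleep sits behind ring_drain(), and the consumer keys off whatever events it
finds there. A drain that finds the flag set but the ring already empty is
harmless and simply returns 0.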

All of these solutions use the fact that perf events are a generic event
framework. If there are any missing details somewhere then fixes/enhancements
can be added - right now our in-kernel event consumers are simple. But the
design is sound.

And none of these solutions involves the incestuous low-level raping of
softirqs.

Thanks,

Ingo