Re: [RFC PATCH 7/9] irq_work: Make self-IPIs optable

From: Frederic Weisbecker
Date: Tue Nov 06 2012 - 09:23:53 EST


2012/10/29 Steven Rostedt <rostedt@xxxxxxxxxxx>:
> On Mon, 2012-10-29 at 14:28 +0100, Frederic Weisbecker wrote:
>> On irq work initialization, let the user choose to define it
>> as "lazy" or not. "Lazy" means that we don't want to send
>> an IPI (provided the arch can anyway) when we enqueue this
>> work but we rather prefer to wait for the next timer tick
>> to execute our work if possible.
>>
>> This is going to be a benefit for non-urgent enqueuers
>> (like printk in the future) that may prefer not to raise
>> an IPI storm in case of frequent enqueuing on short periods
>> of time.
>>
>> Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
>> Cc: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx>
>> ---
>> include/linux/irq_work.h | 31 ++++++++++++++++++++++++++
>> kernel/irq_work.c | 53 ++++++++++++++++++++++++++++++++-------------
>> kernel/time/tick-sched.c | 3 +-
>> 3 files changed, 70 insertions(+), 17 deletions(-)
>>
>> diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h
>> index b39ea0b..7b60c87 100644
>> --- a/include/linux/irq_work.h
>> +++ b/include/linux/irq_work.h
>> @@ -4,6 +4,20 @@
>> #include <linux/llist.h>
>> #include <asm/irq_work.h>
>>
>> +/*
>> + * An entry can be in one of four states:
>> + *
>
> Can you add a comment to what the pointer value is. I know you just
> moved it to the header, but it's still confusing.

Which pointer value? You mean the flag? You mean the below need more
details or?

>
>> + * free NULL, 0 -> {claimed} : free to be used
>> + * claimed NULL, 3 -> {pending} : claimed to be enqueued
>> + * pending next, 3 -> {busy} : queued, pending callback
>> + * busy NULL, 2 -> {free, claimed} : callback in progress, can be claimed
>> + */
>> +
>> +#define IRQ_WORK_PENDING 1UL
>> +#define IRQ_WORK_BUSY 2UL
>> +#define IRQ_WORK_FLAGS 3UL
>> +#define IRQ_WORK_LAZY 4UL /* Doesn't want IPI, wait for tick */
[...]
>> @@ -66,10 +56,28 @@ static void __irq_work_queue(struct irq_work *work)
>> preempt_disable();
>>
>> empty = llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
>> - /* The list was empty, raise self-interrupt to start processing. */
>> - if (empty)
>> +
>> + /*
>> + * In any case, raise an IPI if requested and possible in case
>> + * the queue is empty or it's filled with lazy works.
>> + */
>> + if (!(work->flags & IRQ_WORK_LAZY) && arch_irq_work_has_ipi()) {
>> arch_irq_work_raise();
>
> Doesn't this mean that now if we queue up a bunch of work (say in
> tracing), that we will send out an IPI for each queue? We only want to
> send out an IPI if the list isn't empty. Perhaps we should make two
> lists. One for lazy work and one for immediate work. Then, when adding a
> non-lazy work item, we can check the empty variable for that. No need to
> check the result for the lazy queue. That will be done during tick.

Indeed, if an IPI is pending while we queue another work, we'll raise
another one. I would prefer to avoid the complication of adding
another queue though. Perhaps a per cpu "ipi_pending" flag would be
enough. I'll try something.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/