Re: [PATCH v2 2/2] try to fix /proc/stat scalability of irq sum of all cpu

From: Andrew Morton
Date: Thu Oct 21 2010 - 15:59:10 EST


On Thu, 21 Oct 2010 16:03:43 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>
> In /proc/stat, the number of events per IRQ is shown by summing
> each irq's events over all cpus. We can use kstat_irqs() for this.
>
> kstat_irqs() does the same calculation. If !CONFIG_GENERIC_HARDIRQS,
> it's not a big cost. (Both the number of cpus and irqs are small.)
>
> If a system is very big and CONFIG_GENERIC_HARDIRQS is set, it does
>
> 	for_each_irq()
> 		for_each_cpu()
> 			- look up a radix tree
> 			- read desc->kstat_irqs[cpu]
>
> This seems inefficient. This patch adds kstat_irqs() for
> CONFIG_GENERIC_HARDIRQS and changes the calculation to
>
> 	for_each_irq()
> 		look up radix tree
> 		for_each_cpu()
> 			- read desc->kstat_irqs[cpu]
>
> This reduces the cost: one radix tree lookup per irq instead of one
> per (irq, cpu) pair.
>
> A test on a (4096 cpus, 256 nodes, 4592 irqs) host (by Jack Steiner):
>
> % time cat /proc/stat > /dev/null
>
> Before Patch: 2.459 sec
> After Patch : 0.561 sec
>
> Changelog:
> - rebased onto mmotm-1020 (kernel/irq/handle.c is modified)
>
> Tested-by: Jack Steiner <steiner@xxxxxxx>
> Acked-by: Jack Steiner <steiner@xxxxxxx>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> ---
> fs/proc/stat.c | 9 ++-------
> include/linux/kernel_stat.h | 4 ++++
> kernel/irq/irqdesc.c | 16 ++++++++++++++++
> 3 files changed, 22 insertions(+), 7 deletions(-)
>
> Index: mmotm-1020/fs/proc/stat.c
> ===================================================================
> --- mmotm-1020.orig/fs/proc/stat.c
> +++ mmotm-1020/fs/proc/stat.c
> @@ -108,13 +108,8 @@ static int show_stat(struct seq_file *p,
>  	seq_printf(p, "intr %llu", (unsigned long long)sum);
>
>  	/* sum again ? it could be updated? */
> -	for_each_irq_nr(j) {
> -		per_irq_sum = 0;
> -		for_each_possible_cpu(i)
> -			per_irq_sum += kstat_irqs_cpu(j, i);
> -
> -		seq_printf(p, " %u", per_irq_sum);
> -	}
> +	for_each_irq_nr(j)
> +		seq_printf(p, " %u", kstat_irqs(j));
>
>  	seq_printf(p,
>  		"\nctxt %llu\n"
> Index: mmotm-1020/include/linux/kernel_stat.h
> ===================================================================
> --- mmotm-1020.orig/include/linux/kernel_stat.h
> +++ mmotm-1020/include/linux/kernel_stat.h
> @@ -86,6 +86,7 @@ static inline unsigned int kstat_softirq
>  /*
>   * Number of interrupts per specific IRQ source, since bootup
>   */
> +#ifndef CONFIG_GENERIC_HARDIRQS
>  static inline unsigned int kstat_irqs(unsigned int irq)
>  {
>  	unsigned int sum = 0;
> @@ -96,6 +97,9 @@ static inline unsigned int kstat_irqs(un
>
>  	return sum;
>  }
> +#else
> +extern unsigned int kstat_irqs(unsigned int irq);
> +#endif

hrm, why on earth was that inlined.

>  /*
>   * Number of interrupts per cpu, since bootup
> Index: mmotm-1020/kernel/irq/irqdesc.c
> ===================================================================
> --- mmotm-1020.orig/kernel/irq/irqdesc.c
> +++ mmotm-1020/kernel/irq/irqdesc.c
> @@ -393,3 +393,19 @@ unsigned int kstat_irqs_cpu(unsigned int
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	return desc ? desc->kstat_irqs[cpu] : 0;
>  }
> +
> +#ifdef CONFIG_GENERIC_HARDIRQS
> +unsigned int kstat_irqs(unsigned int irq)
> +{
> +	struct irq_desc *desc = irq_to_desc(irq);
> +	int cpu;
> +	int sum = 0;
> +
> +	if (!desc)
> +		return 0;
> +	for_each_possible_cpu(cpu)
> +		sum += desc->kstat_irqs[cpu];
> +	return sum;
> +}
> +EXPORT_SYMBOL_GPL(kstat_irqs);

kstat_irqs() needs to be exported to modules because of some silliness
in drivers/isdn/hisax/config.c. But in linux-next that silliness got
deleted and the kstat_irqs export was removed.

--- a/kernel/irq/irqdesc.c~proc-stat-fix-scalability-of-irq-sum-of-all-cpu-fix
+++ a/kernel/irq/irqdesc.c
@@ -395,7 +395,7 @@ unsigned int kstat_irqs_cpu(unsigned int
 }

 #ifdef CONFIG_GENERIC_HARDIRQS
-unsigned int kstat_irqs(unsigned int irq)
+unsigned int kstat_irqs(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
 	int cpu;
@@ -407,5 +407,4 @@ unsigned int kstat_irqs(unsigned int ir
 		sum += desc->kstat_irqs[cpu];
 	return sum;
 }
-EXPORT_SYMBOL_GPL(kstat_irqs);
-#endif /*CONFIG_GENERIC_HARDIRQS*/
+#endif /* CONFIG_GENERIC_HARDIRQS */
_
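
For illustration only (this is not part of either patch): the loop
reordering described in the quoted changelog can be sketched as a small
userspace model. Everything prefixed model_ or MODEL_ is hypothetical,
and model_irq_to_desc() is only a stand-in for irq_to_desc(): it counts
lookups instead of walking a radix tree. The sketch assumes nothing
about real kernel sizes or timings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes for illustration; not the kernel's values. */
#define MODEL_NR_IRQS 8
#define MODEL_NR_CPUS 4

struct model_irq_desc {
	unsigned int kstat_irqs[MODEL_NR_CPUS];	/* per-cpu event counts */
};

static struct model_irq_desc model_descs[MODEL_NR_IRQS];
static unsigned long model_lookups;	/* number of descriptor lookups so far */

/* Stand-in for irq_to_desc(): counts each lookup, returns NULL if out of range. */
static struct model_irq_desc *model_irq_to_desc(unsigned int irq)
{
	model_lookups++;
	return irq < MODEL_NR_IRQS ? &model_descs[irq] : NULL;
}

/* Old shape: the descriptor is looked up once per (irq, cpu) pair. */
static unsigned int model_sum_per_cpu_lookup(unsigned int irq)
{
	unsigned int cpu, sum = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct model_irq_desc *desc = model_irq_to_desc(irq);

		sum += desc ? desc->kstat_irqs[cpu] : 0;
	}
	return sum;
}

/* New shape: the descriptor is looked up once per irq. */
static unsigned int model_sum_single_lookup(unsigned int irq)
{
	struct model_irq_desc *desc = model_irq_to_desc(irq);
	unsigned int cpu, sum = 0;

	if (!desc)
		return 0;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += desc->kstat_irqs[cpu];
	return sum;
}

/* Fill the model with arbitrary but predictable counts. */
static void model_fill(void)
{
	unsigned int irq, cpu;

	for (irq = 0; irq < MODEL_NR_IRQS; irq++)
		for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
			model_descs[irq].kstat_irqs[cpu] = irq + cpu;
}

/* Sanity check: both shapes must report the same total for every irq. */
static void model_check(void)
{
	unsigned int irq;

	for (irq = 0; irq < MODEL_NR_IRQS; irq++)
		assert(model_sum_per_cpu_lookup(irq) ==
		       model_sum_single_lookup(irq));
}
```

Calling model_fill() and then each summing function for one irq leaves
model_lookups at MODEL_NR_CPUS for the old shape and at 1 for the new
one, while both return the same sum.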
