Re: [PATCH v2 0/4] /proc/stat: Reduce irqs counting performance overhead

From: Matthew Wilcox
Date: Wed Jan 09 2019 - 14:59:48 EST


On Wed, Jan 09, 2019 at 01:54:36PM -0500, Waiman Long wrote:
> If you read patch 4, you can see that quite a few CPU cycles were
> spent looking up the radix tree to locate the IRQ descriptor for each of
> the interrupts. That overhead will still be there even if I use percpu
> counters. So using percpu counters alone won't be as performant as this
> patch or my previous v1 patch.

Hm, if that's the overhead, then the radix tree (and the XArray) have
APIs that can reduce that overhead. Right now, there's only one caller
of kstat_irqs_usr() (the proc code). If we change that to fill an array
instead of returning a single value, it can look something like this:

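void kstat_irqs_usr(unsigned int *sums)
{
	XA_STATE(xas, &irq_descs, 0);
	struct irq_desc *desc;
	int cpu;

	/* xas_for_each() needs the RCU read lock (or the xa_lock) held */
	rcu_read_lock();
	xas_for_each(&xas, desc, ULONG_MAX) {
		unsigned int sum = 0;

		if (!desc->kstat_irqs)
			continue;
		for_each_possible_cpu(cpu)
			sum += *per_cpu_ptr(desc->kstat_irqs, cpu);

		/* xas is a struct on the stack, not a pointer */
		sums[xas.xa_index] = sum;
	}
	rcu_read_unlock();
}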
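
For anyone who wants to poke at the shape of this outside the kernel, here is a rough userspace model of the same idea: walk the descriptor store once and fill the caller's array, instead of doing one lookup per IRQ number. Everything in it is an assumption made for illustration. The names irq_desc_model, irq_descs_model, fill_irq_sums, NR_CPUS_MODEL and NR_IRQS_MODEL are hypothetical, and a flat pointer array stands in for the tree, so it says nothing about the real lookup cost.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4		/* hypothetical CPU count for this model */
#define NR_IRQS_MODEL 16	/* hypothetical size of the IRQ number space */

/* Stand-in for struct irq_desc: only the per-CPU counters matter here. */
struct irq_desc_model {
	unsigned int kstat_irqs[NR_CPUS_MODEL];
};

/*
 * Stand-in for the sparse descriptor store. A NULL slot means that IRQ
 * number has no descriptor. The kernel keeps these in a tree, not in a
 * flat array.
 */
static struct irq_desc_model *irq_descs_model[NR_IRQS_MODEL];

/* Visit each populated slot once and write its total into sums[irq]. */
static void fill_irq_sums(unsigned int *sums)
{
	size_t irq;
	int cpu;

	assert(sums != NULL);

	for (irq = 0; irq < NR_IRQS_MODEL; irq++) {
		struct irq_desc_model *desc = irq_descs_model[irq];
		unsigned int sum = 0;

		if (!desc)
			continue;	/* sums[irq] is left untouched */
		for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
			sum += desc->kstat_irqs[cpu];

		sums[irq] = sum;
	}
}
```

Note that slots without a descriptor are skipped and never written, so the caller has to zero the array before the call if it wants 0 reported for those IRQs. The same applies to the kstat_irqs_usr() sketch above.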