Re: [patch V6 00/16] Improve /proc/interrupts further
From: Shrikanth Hegde
Date: Thu May 21 2026 - 00:35:15 EST
Hi Thomas.
On 5/20/26 8:57 PM, Thomas Gleixner wrote:
Shrikanth!
On Wed, May 20 2026 at 02:48, Shrikanth Hegde wrote:
On 5/18/26 1:31 AM, Thomas Gleixner wrote:
Ran perf stat -r 1000 cat /proc/interrupts > tmp.txt
and observed minimal improvement with the series.
Can you redirect it to /dev/null instead to take the file operations out
of the picture?
Yes. Did "perf stat -r 1000 cat /proc/interrupts > /dev/null".
It shows a larger improvement with the series than the run that writes to a file.
Base:
1,313,263 cycles:HG # 4.063 GHz ( +- 0.17% )
2,172,511 instructions:HG # 1.65 insn per cycle ( +- 0.05% )
v6 series:
1,224,666 cycles:HG # 4.058 GHz ( +- 0.25% )
1,667,435 instructions:HG # 1.36 insn per cycle ( +- 0.08% )
Interesting. The number of instructions goes down by 20+%, but at the
same time IPC drops too.
base:
Performance counter stats for 'cat /proc/interrupts' (1000 runs):
0.32 msec task-clock:HG # 0.615 CPUs utilized ( +- 0.17% )
0 context-switches:HG # 0.000 /sec
0 cpu-migrations:HG # 0.000 /sec
44 page-faults:HG # 136.347 K/sec ( +- 0.03% )
1,310,621 cycles:HG # 4.061 GHz ( +- 0.17% )
2,182,042 instructions:HG # 1.66 insn per cycle ( +- 0.04% )
372,189 branches:HG # 1.153 G/sec ( +- 0.04% )
4,710 branch-misses:HG # 1.27% of all branches ( +- 0.33% )
0.000525061 +- 0.000000889 seconds time elapsed ( +- 0.17% )
v6:
Performance counter stats for 'cat /proc/interrupts' (1000 runs):
0.28 msec task-clock:HG # 0.577 CPUs utilized ( +- 0.25% )
0 context-switches:HG # 0.000 /sec
0 cpu-migrations:HG # 0.000 /sec
44 page-faults:HG # 155.906 K/sec ( +- 0.03% )
1,144,964 cycles:HG # 4.057 GHz ( +- 0.24% )
1,628,375 instructions:HG # 1.42 insn per cycle ( +- 0.07% )
271,934 branches:HG # 963.546 M/sec ( +- 0.07% )
4,683 branch-misses:HG # 1.72% of all branches ( +- 0.49% )
0.00048895 +- 0.00000114 seconds time elapsed ( +- 0.23% ) << 7-8% improvement.
v6 + ppc_hack:
Performance counter stats for 'cat /proc/interrupts' (1000 runs):
0.27 msec task-clock:HG # 0.582 CPUs utilized ( +- 0.15% )
0 context-switches:HG # 0.000 /sec
0 cpu-migrations:HG # 0.000 /sec
44 page-faults:HG # 160.232 K/sec ( +- 0.03% )
1,113,983 cycles:HG # 4.057 GHz ( +- 0.15% )
1,501,868 instructions:HG # 1.35 insn per cycle ( +- 0.07% )
251,432 branches:HG # 915.627 M/sec ( +- 0.06% )
4,528 branch-misses:HG # 1.80% of all branches ( +- 0.18% )
0.000472057 +- 0.000000668 seconds time elapsed ( +- 0.14% ) << only slightly better.
Looking at powerpc's arch_show_interrupts(), it could use a similar set
of optimizations:
- move to an array-based layout
- use irq_proc_emit_counts
- some interrupts, such as machine check, are hardly ever set; set skip_vector.
Copilot suggested the diff below to quickly try irq_proc_emit_counts integration.
It showed small gains compared to v6, so it may be worth fixing that the
right way (similar to the x86 work you have done).
Performance counter stats for 'cat /proc/interrupts' (1000 runs):
0.29 msec task-clock:HG # 0.586 CPUs utilized ( +- 0.22% )
0 context-switches:HG # 0.000 /sec
0 cpu-migrations:HG # 0.000 /sec
44 page-faults:HG # 153.067 K/sec ( +- 0.03% )
1,166,567 cycles:HG # 4.058 GHz ( +- 0.22% )
1,475,365 instructions:HG # 1.26 insn per cycle ( +- 0.09% )
249,051 branches:HG # 866.397 M/sec ( +- 0.10% )
5,104 branch-misses:HG # 2.05% of all branches ( +- 0.33% )
0.000490211 +- 0.000000992 seconds time elapsed ( +- 0.20% ) <<< 3-4% improvements.
Again IPC drops ....
Yes, the IPC drop is consistent. I see the same trend in PATCH 1/16 of the series.
Copying that snippet below.
Before:
8,932,242 instructions # 1.66 insn per cycle ( +- 0.34% )
After:
7,020,982 instructions # 1.30 insn per cycle ( +- 0.52% )
So it might be a common pattern across archs. Maybe the perf stat subsystem is slow
enough that it doesn't show the absolute benefit.
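To make the trend concrete, the relative changes can be worked out from the base and v6 counters quoted above. This is only a sketch of the arithmetic; the helper names pct_drop() and ipc() are made up for illustration and are not part of perf.

```python
# Counter values copied from the base and v6 'perf stat' runs in this mail.
base = {"cycles": 1_310_621, "instructions": 2_182_042, "elapsed_s": 0.000525061}
v6 = {"cycles": 1_144_964, "instructions": 1_628_375, "elapsed_s": 0.00048895}


def pct_drop(before, after):
    """Relative reduction from 'before' to 'after', in percent."""
    return (before - after) / before * 100.0


def ipc(run):
    """Instructions per cycle for one run."""
    return run["instructions"] / run["cycles"]


for key in ("instructions", "cycles", "elapsed_s"):
    print(f"{key:>12}: {pct_drop(base[key], v6[key]):5.1f}% lower with v6")

print(f"IPC: base={ipc(base):.2f} v6={ipc(v6):.2f}")
```

Instructions drop by roughly a quarter while cycles drop by roughly an eighth, so the ratio (IPC) goes down even though the run gets faster.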
In addition, I ran "perf stat -a -r 1000 cat /proc/interrupts > /dev/null".
It is now 10x slower. IPC is the same with the series, and the improvement vanishes.
So the heavier the infrastructure used for testing, the smaller the gains get, I guess.
But I don't see any regression.
As you said in the cover letter, the micro loops you ran may be the best way to evaluate it.
If you have the code in shareable form, I can give it a try.
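In the meantime, something like the loop below is what I would try on my side. It is NOT the code from the cover letter, just my guess at a minimal read loop: time_reads() and its parameters are hypothetical names, and it simply times repeated open/read/close passes in one process so that fork/exec of cat and the perf stat setup stay out of the measurement.

```python
import os
import time


def time_reads(path="/proc/interrupts", iterations=1000, bufsize=1 << 16):
    """Return the average nanoseconds per full open/read/close pass over path.

    Hypothetical helper: an assumption about what a micro loop could look
    like, not the loop used for the numbers in the cover letter.
    """
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fd = os.open(path, os.O_RDONLY)
        try:
            # Read until EOF; the file may be returned in several chunks.
            while os.read(fd, bufsize):
                pass
        finally:
            os.close(fd)
    return (time.perf_counter_ns() - start) / iterations
```

Calling time_reads() on the base and the patched kernel and comparing the two averages would give a rough number. The Python interpreter adds its own overhead, so a C version of the same loop would be less noisy.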
Other than that, the code improvements look good to me.