Re: [patch 1/2] genirq: Avoid summation loops for /proc/stat
From: Waiman Long
Date: Wed Jan 30 2019 - 11:00:16 EST
On 01/30/2019 07:31 AM, Thomas Gleixner wrote:
> Waiman reported that on large systems with a large number of interrupts the
> readout of /proc/stat takes a long time to sum up the interrupt
> statistics. In principle this is not a problem, but for unknown reasons
> some enterprise quality software reads /proc/stat with a high frequency.
>
> The reason for this is that interrupt statistics are accounted per cpu. So
> the /proc/stat logic has to sum up the interrupt stats for each interrupt.
>
> This can be largely avoided for interrupts which are not marked as
> 'PER_CPU' interrupts by simply adding a per interrupt summation counter
> which is incremented along with the per interrupt per cpu counter.
>
> The PER_CPU interrupts need to avoid that and use only per cpu accounting
> because they share the interrupt number and the interrupt descriptor and
> concurrent updates would conflict or require unwanted synchronization.
>
> Reported-by: Waiman Long <longman@xxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> 8<-------------
>
> include/linux/irqdesc.h | 3 ++-
> kernel/irq/chip.c | 12 ++++++++++--
> kernel/irq/internals.h | 8 +++++++-
> kernel/irq/irqdesc.c | 7 ++++++-
> 4 files changed, 25 insertions(+), 5 deletions(-)
>
>
> --- a/include/linux/irqdesc.h
> +++ b/include/linux/irqdesc.h
> @@ -65,9 +65,10 @@ struct irq_desc {
> unsigned int core_internal_state__do_not_mess_with_it;
> unsigned int depth; /* nested irq disables */
> unsigned int wake_depth; /* nested wake enables */
> + unsigned int tot_count;
> unsigned int irq_count; /* For detecting broken IRQs */
> - unsigned long last_unhandled; /* Aging timer for unhandled count */
> unsigned int irqs_unhandled;
> + unsigned long last_unhandled; /* Aging timer for unhandled count */
> atomic_t threads_handled;
> int threads_handled_last;
> raw_spinlock_t lock;
Just one minor nit. Why do you want to move last_unhandled down one
slot? There were 5 ints before it. Adding one more will just fill the
padding hole. Moving last_unhandled down will probably leave 4-byte
holes both above and below it, assuming that raw_spinlock_t is 4 bytes.
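
To make the hole arithmetic concrete, here is a rough userspace sketch. It
is not the real struct irq_desc: the two structs are hypothetical stand-ins
that only model the members around the hunk, with uint64_t standing in for
pointers and unsigned long, and uint32_t standing in for unsigned int,
atomic_t and a 4-byte raw_spinlock_t. It assumes a 64-bit ABI where
uint64_t is 8-byte aligned. The names ptr_before, ptr_after, padding_keep(),
padding_move() and print_layout_report() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Simplified, hypothetical model; NOT the real struct irq_desc.
 * Assumed: 64-bit ABI, 8-byte aligned uint64_t (pointers, unsigned long),
 * 4-byte unsigned int, atomic_t and raw_spinlock_t (modelled as uint32_t).
 */

/* tot_count added, last_unhandled left where it was. */
struct layout_keep {
	uint64_t ptr_before;		/* stands in for a preceding pointer */
	uint32_t status_use_accessors;
	uint32_t core_internal_state;
	uint32_t depth;
	uint32_t wake_depth;
	uint32_t tot_count;		/* new member fills the old hole */
	uint32_t irq_count;
	uint64_t last_unhandled;
	uint32_t irqs_unhandled;
	uint32_t threads_handled;
	uint32_t threads_handled_last;
	uint32_t lock;
	uint64_t ptr_after;		/* stands in for a following pointer */
};

/* tot_count added, last_unhandled moved below irqs_unhandled. */
struct layout_move {
	uint64_t ptr_before;
	uint32_t status_use_accessors;
	uint32_t core_internal_state;
	uint32_t depth;
	uint32_t wake_depth;
	uint32_t tot_count;
	uint32_t irq_count;
	uint32_t irqs_unhandled;	/* 7th int: alignment hole follows */
	uint64_t last_unhandled;
	uint32_t threads_handled;
	uint32_t threads_handled_last;
	uint32_t lock;			/* 3 ints: alignment hole follows */
	uint64_t ptr_after;
};

/* Both variants hold the same members: 3 x 8 bytes + 10 x 4 bytes. */
#define MEMBER_BYTES (3 * sizeof(uint64_t) + 10 * sizeof(uint32_t))

/* Padding is the struct size minus the sum of the member sizes. */
size_t padding_keep(void)
{
	return sizeof(struct layout_keep) - MEMBER_BYTES;
}

size_t padding_move(void)
{
	return sizeof(struct layout_move) - MEMBER_BYTES;
}

void print_layout_report(void)
{
	/* Sanity check for this model: the move should not reduce padding. */
	assert(padding_keep() <= padding_move());

	printf("keep: last_unhandled at %zu, size %zu, padding %zu\n",
	       offsetof(struct layout_keep, last_unhandled),
	       sizeof(struct layout_keep), padding_keep());
	printf("move: last_unhandled at %zu, size %zu, padding %zu\n",
	       offsetof(struct layout_move, last_unhandled),
	       sizeof(struct layout_move), padding_move());
}
```

Under those assumptions the first variant has no padding, while the second
one picks up two 4-byte holes, one on each side of last_unhandled. On a
real kernel build, running pahole -C irq_desc on a vmlinux with debug info
would show the actual layout, which also depends on the config options
that add or remove members.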
Cheers,
Longman