Re: [patch V2 1/2] genirq: Avoid summation loops for /proc/stat

From: Andrew Morton
Date: Fri Feb 08 2019 - 18:21:56 EST


On Fri, 8 Feb 2019 17:46:39 -0500 Waiman Long <longman@xxxxxxxxxx> wrote:

> On 02/08/2019 05:32 PM, Andrew Morton wrote:
> > On Fri, 08 Feb 2019 14:48:03 +0100 Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >
> >> Waiman reported that on large systems with a large number of interrupts the
> >> readout of /proc/stat takes a long time to sum up the interrupt
> >> statistics. In principle this is not a problem, but for unknown reasons
> >> some enterprise quality software reads /proc/stat with a high frequency.
> >>
> >> The reason for this is that interrupt statistics are accounted per cpu. So
> >> the /proc/stat logic has to sum up the interrupt stats for each interrupt.
> >>
> >> This can be largely avoided for interrupts which are not marked as
> >> 'PER_CPU' interrupts by simply adding a per interrupt summation counter
> >> which is incremented along with the per interrupt per cpu counter.
> >>
> >> The PER_CPU interrupts need to avoid that and use only per cpu accounting
> >> because they share the interrupt number and the interrupt descriptor and
> >> concurrent updates would conflict or require unwanted synchronization.
> >>
> >> ...
> >>
> >> --- a/include/linux/irqdesc.h
> >> +++ b/include/linux/irqdesc.h
> >> @@ -65,6 +65,7 @@ struct irq_desc {
> >>  	unsigned int		core_internal_state__do_not_mess_with_it;
> >>  	unsigned int		depth;		/* nested irq disables */
> >>  	unsigned int		wake_depth;	/* nested wake enables */
> >> +	unsigned int		tot_count;
> > Confused. Isn't this going to quickly overflow?
> >
> >
> All the current irq count computations for each individual irq use
> the unsigned int type. Only the sum of all the irqs is u64. Yes, it is
> possible for an individual irq count to exceed 32 bits given sufficient
> uptime. My PC has an uptime of 36 days and the highest irq count value
> is 79,227,699. At the current rate, the overflow will happen after
> about 5 years. A larger server system may overflow in a much
> shorter period. So maybe we should consider changing all the irq counts
> to unsigned long then.

It sounds like it. A 10 kHz interrupt will overflow in about 5 days...