Re: [PATCH] fs/proc: introduce /proc/stat2 file
From: Dave Chinner
Date: Wed Nov 07 2018 - 21:07:59 EST
On Wed, Nov 07, 2018 at 11:03:06AM +0100, Miklos Szeredi wrote:
> On Wed, Nov 7, 2018 at 12:48 AM, Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Mon, 29 Oct 2018 23:04:45 +0000 Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> >
> >> On Mon, Oct 29, 2018 at 7:25 PM, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
> >> > This patch introduces a new /proc/stat2 file that is identical to the
> >> > regular 'stat' except that it zeroes all hard irq statistics. The new
> >> > file is a drop in replacement to stat for users that need performance.
> >>
> >> For a while now, I've been thinking over ways to improve the
> >> performance of collecting various bits of kernel information. I don't
> >> think that a proliferation of special-purpose named bag-of-fields file
> >> variants is the right answer, because even if you add a few info-file
> >> variants, you're still left with a situation where a given file
> >> provides a particular caller with too little or too much information.
> >> I'd much rather move to a model in which userspace *explicitly* tells
> >> the kernel which fields it wants, with the kernel replying with just
> >> those particular fields, maybe in their raw binary representations.
> >> The ASCII-text bag-of-everything files would remain available for
> >> ad-hoc and non-performance critical use, but programs that cared about
> >> performance would have an efficient bypass. One concrete approach is
> >> to let users open up today's proc files and, instead of read(2)ing a
> >> text blob, use an ioctl to retrieve specified and targeted information
> >> of the sort that would normally be encoded in the text blob. Because
> >> callers would open the same file when using either the text or binary
> >> interfaces, little would have to change, and it'd be easy to implement
> >> fallbacks when a particular system doesn't support a particular
> >> fast-path ioctl.
>
> Please. Sysfs, with the one value per file rule, was created exactly
> for the purpose of eliminating these sort of problems with procfs. So
> instead of inventing special purpose interfaces for proc, just make
> the info available in sysfs, if not already available.
This doesn't solve the problem.
The problem is that this specific implementation of per-cpu
counters needs to be summed on every read. Hence when you have a huge
number of CPUs, each per-cpu iteration takes a substantial
amount of time.
If only we had percpu counters that had a fixed, extremely low read
overhead that doesn't care about the number of CPUs in the
machine....
Oh, wait, we do: percpu_counter.[ch].
This all seems like a counter implementation deficiency to me, not
an interface problem...
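
To illustrate the difference, here is a rough userspace sketch, not
the kernel code. All names in it (naive_counter, batched_counter,
NR_CPUS, BATCH) are made up for illustration, and it is
single-threaded, so it leaves out the locking and real per-cpu
storage that lib/percpu_counter.c has. The idea it shows: each CPU
accumulates a small local delta and folds it into a global count once
it reaches a batch size, so the cheap read is O(1) and approximate
(what percpu_counter_read() gives you), while the exact sum
(percpu_counter_sum()) still has to walk every CPU.

```c
#include <assert.h>

#define NR_CPUS 8   /* hypothetical CPU count for the sketch */
#define BATCH   32  /* hypothetical fold threshold */

/* Naive scheme: one slot per CPU, every read sums all slots. */
struct naive_counter {
	long percpu[NR_CPUS];
};

static void naive_add(struct naive_counter *c, int cpu, long n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->percpu[cpu] += n;
}

/* O(NR_CPUS) on every read: this is the cost that hurts on big machines. */
static long naive_read(const struct naive_counter *c)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->percpu[cpu];
	return sum;
}

/* Batched scheme: global count plus small per-CPU deltas. */
struct batched_counter {
	long count;            /* global, approximate value */
	long percpu[NR_CPUS];  /* local deltas not yet folded in */
};

static void batched_add(struct batched_counter *c, int cpu, long n)
{
	long v;

	assert(cpu >= 0 && cpu < NR_CPUS);
	v = c->percpu[cpu] + n;
	if (v >= BATCH || v <= -BATCH) {
		/* Fold the local delta into the global count.
		 * The real implementation takes a lock here. */
		c->count += v;
		c->percpu[cpu] = 0;
	} else {
		c->percpu[cpu] = v;
	}
}

/* O(1) read; each CPU can hold back less than BATCH in magnitude. */
static long batched_read(const struct batched_counter *c)
{
	return c->count;
}

/* Exact read: still O(NR_CPUS), for callers that need precision. */
static long batched_sum(const struct batched_counter *c)
{
	long sum = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->percpu[cpu];
	return sum;
}
```

With these values, if each of the 8 CPUs adds 1 a hundred times,
naive_read() and batched_sum() both return 800, while batched_read()
returns 768: each CPU has folded three batches of 32 and still holds
4 locally. Whether that error is acceptable for the counters in
question is a separate decision from the interface.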
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx