Re: [PATCH 0/6] sched,delayacct: Some cleanups

From: Peter Zijlstra
Date: Thu May 06 2021 - 05:16:59 EST


On Thu, May 06, 2021 at 08:29:40AM +1000, Balbir Singh wrote:
> On Wed, May 05, 2021 at 12:59:40PM +0200, Peter Zijlstra wrote:
> > Hi,
> >
> > Due to:
> >
> > https://lkml.kernel.org/r/0000000000001d43ac05c0f5c6a0@xxxxxxxxxx
> >
> > and general principle, delayacct really shouldn't be using ktime (pvclock also
> > really shouldn't be doing what it does, but that's another story). This led me
> > to looking at the SCHED_INFO, SCHEDSTATS, DELAYACCT (and PSI) accounting hell.
> >
> > The rest of the patches are an attempt at simplifying all that a little. All
> > that crud is enabled by default for distros which is leading to a death by a
> > thousand cuts.
> >
> > The last patch is an attempt at default disabling DELAYACCT, because I don't
> > think anybody actually uses that much, but what do I know, there were no ill
> > effects on my testbox. Perhaps we should mirror
> > /proc/sys/kernel/sched_schedstats and provide a delayacct sysctl for runtime
> > frobbing.
> >
>
> There are tools like iotop that use delayacct to display information.

Right, but how many actual people use that? Does that justify saddling
the whole sodding world with the overhead?

> When the
> code was checked in, we did run SPEC* back in the day (2006) to find overheads;
> nothing significant showed. Do we have any data on the overhead you're seeing?

I've not looked, but having it disabled saves that per-task allocation
and that spinlock in delayacct_end() for iowait wakeups and a bunch of
cache misses, of course.

I doubt SPEC is a benchmark that tickles those paths much if at all.

The thing is, we can't just keep growing more and more stats; that'll
kill us quite dead.