Re: fast path cycle muncher (vmstat: make vmstat_updater deferrable again and shut down on idle)

From: Christoph Lameter
Date: Mon Jan 25 2016 - 13:02:15 EST


On Mon, 25 Jan 2016, Michal Hocko wrote:

> On Sat 23-01-16 17:21:55, Mike Galbraith wrote:
> > Hi Christoph,
> >
> > While you're fixing that commit up, can you perhaps find a better home
> > for quiet_vmstat()? It not only munches cycles when switching
> > cross-core mightily, for -rt it injects a sleeping lock into the idle task.
> >
> > 12.89% [kernel] [k] refresh_cpu_vm_stats.isra.12
> > 4.75% [kernel] [k] __schedule
> > 4.70% [kernel] [k] mutex_unlock
> > 3.14% [kernel] [k] __switch_to
>
> Hmm, I wouldn't have expected that refresh_cpu_vm_stats could have
> such a large footprint. I guess this would be just an expensive noop
> because we have to check all the zones*counters and do an expensive
> this_cpu_xchg. Is the whole deferred thing worth this overhead?

Why would the deferring cause this overhead?

Also there is no cross core activity from quiet_vmstat(). It simply
disables the local vmstat updates.

> Unless there is a clear and huge win from doing the vmstat update
> deferrable then I think a revert is more appropriate IMHO.

It reduces the OS events that the application experiences by folding the
vmstat update into the tick events. If it's not deferrable then a timer
event will be generated in addition to the tick. We do not want that.

Workqueues are used in many places. If RT can sleep within workqueue
management functions then spinlocks cannot be taken anymore and there may
be issues with preemption.

The regression that I know of (independent of "RT") is, as far as I
know, due to the switch of the parameters of some vmstat functions to 64
bit instead of 32 bit.