RE: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath

Date: Mon Aug 10 2015 - 05:46:27 EST


> -----Original Message-----
> From: Andrew Morton [mailto:akpm@xxxxxxxxxxxxxxxxxxxx]
> Sent: Saturday, August 08, 2015 4:06 AM
> Cc: 'Michal Hocko'; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> minchan@xxxxxxxxxx; dave@xxxxxxxxxxxx; koct9i@xxxxxxxxx;
> mgorman@xxxxxxx; vbabka@xxxxxxx; js1304@xxxxxxxxx;
> hannes@xxxxxxxxxxx; alexander.h.duyck@xxxxxxxxxx;
> sasha.levin@xxxxxxxxxx; cl@xxxxxxxxx; fengguang.wu@xxxxxxxxx;
> cpgs@xxxxxxxxxxx; pintu_agarwal@xxxxxxxxx; pintu.k@xxxxxxxxxxx;
> Subject: Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath
> On Fri, 07 Aug 2015 18:16:47 +0530 PINTU KUMAR <pintu.k@xxxxxxxxxxx>
> wrote:
> > > > This is useful to know the rate of allocation success within the
> > > > slowpath.
> > >
> > > What would be that information good for? Is a regular administrator
> > > expected to consume this value or this is aimed more to kernel
> > > developers? If the latter then I think a trace point sounds like a
> > > better interface.
> > >
> > This information is good for kernel developers.
> > I found this information useful while debugging low memory situation
> > and sluggishness behavior.
> > I wanted to know how many times the first allocation is failing and
> > how many times system entering slowpath.
> > As I said, the existing counter does not give this information clearly.
> > The pageoutrun, allocstall is too confusing.
> > Also, if kswapd and compaction is disabled, we have no other counter
> > for slowpath (except allocstall).
> > Another problem is that allocstall can also be incremented from
> > hibernation during shrink_all_memory calling.
> > Which may create more confusion.
> > Thus I found this interface useful to understand low memory behavior.
> > If device sluggishness is happening because of too many slowpath or
> > due to some other problem.
> > Then we can decide what will be the best memory configuration for my
> > device to reduce the slowpath.
> >
> > Regarding trace points, I am not sure if we can attach counter to it.
> > Also trace may have more over-head and requires additional configs to
> > be enabled to debug.
> > Mostly these configs will not be enabled by default (at least in
> > embedded, low memory device).
> > I found the vmstat interface more easy and useful.
> This does seem like a pretty basic and sensible thing to expose in vmstat. It
> probably makes more sense than some of the other things we have in there.
Thanks Andrew.
Yes, as per my analysis, I feel that this is one of the useful and important
I added it in one of our internal products and found it to be very useful.
Especially during shrink_memory and compact_nodes analysis I found it really
It helps me to show that if higher-order pages are present, the slowpath can be
reduced drastically.
Also during my ELC presentation people asked me how to monitor the slowpath

> Yes, it could be a tracepoint but practically speaking, a tracepoint makes it
> developer-only. You can ask a bug reporter or a customer "what is
> /proc/vmstat:slowpath_entered" doing, but it's harder to ask them to set up
> tracing.
Yes, at times traces are painful to analyze.
Also, in commercial user binaries, most tracing support is disabled (with no
root privileges).
However, /proc/vmstat works with normal user binaries.
When memory issues are reported, we just get log dumps and a few interfaces like
Most of the time these memory issues are hard to reproduce because they may
happen only after long usage.

> And I don't think this will lock us into anything - vmstat is a big dumping
> ground and I don't see a big problem with removing or changing things later
> on. IMO, debugfs rules apply here and vmstat would be in debugfs, had debugfs
> existed at the time.
> Two things:
> - we appear to have forgotten to document /proc/vmstat
Yes, I could not find any document on vmstat under kernel/Documentation.
I think it's a nice thing to have.
Maybe I can start this initiative to create one :)
If the respective owners can update it, that will be great.

> - How does one actually use slowpath_entered? Obviously we'd like to
> know "what proportion of allocations entered the slowpath", so we
> calculate
> slowpath_entered/X
> how do we obtain "X"? Is it by adding up all the pgalloc_*? If
> so, perhaps we should really have slowpath_entered_dma,
> slowpath_entered_dma32, ...?

I think the slowpath for other zones may not be required.
We just need to know how many times we entered slowpath and possibly do
something to reduce it.
But, I think, pgalloc_* count may also include success for fastpath.
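
A minimal sketch of the slowpath_entered/X calculation discussed above, assuming X is the sum of all pgalloc_* counters. The helper names parse_vmstat and slowpath_ratio are hypothetical, and slowpath_entered is only present with this patch applied. Since pgalloc_* also covers fastpath successes (and, as far as I know, counts pages rather than allocation calls), treat the result as a rough indicator only.

```python
def parse_vmstat(text):
    """Parse 'name value' lines (the /proc/vmstat format) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <unsigned integer>".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def slowpath_ratio(counters):
    """Return slowpath_entered / sum(pgalloc_*), or None if it cannot be computed."""
    total = sum(v for k, v in counters.items() if k.startswith("pgalloc_"))
    entered = counters.get("slowpath_entered")  # assumption: patch is applied
    if entered is None or total == 0:
        return None
    return entered / total
```

On a device this would be fed the contents of /proc/vmstat, for example slowpath_ratio(parse_vmstat(open("/proc/vmstat").read())).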

How I use slowpath for analysis is:
counter               before       after    % diff
----------------  ----------  ----------  --------
nr_free_pages           6726       12494    46.17%
pgalloc_normal        985836     1549333    36.37%
pageoutrun              2699         529    80.40%
allocstall               298          98    67.11%
slowpath_entered       16659         739    95.56%
compact_stall            244          21    91.39%
compact_fail             178          11    93.82%
compact_success           52           7    86.54%

The above values are from a 512MB system with only the NORMAL zone.
Before, the slowpath count was 16659.
After (memory shrinker + compaction), the slowpath count dropped by 95%, for the same
This is just an example.
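
The percentage column in the table above appears consistent with taking the absolute difference between the two samples and dividing by the larger of them. That formula is an inference from the numbers, not something stated in the mail. A small sketch, with a hypothetical helper name pct_diff:

```python
def pct_diff(before, after):
    """Absolute change between two samples, relative to the larger one, in percent."""
    larger = max(before, after)
    if larger == 0:
        return 0.0
    return abs(before - after) / larger * 100


# (before, after) pairs copied from the table above.
rows = {
    "nr_free_pages": (6726, 12494),
    "pgalloc_normal": (985836, 1549333),
    "pageoutrun": (2699, 529),
    "allocstall": (298, 98),
    "slowpath_entered": (16659, 739),
    "compact_stall": (244, 21),
    "compact_fail": (178, 11),
    "compact_success": (52, 7),
}

for name, (before, after) in rows.items():
    print(f"{name:<16}  {before:>10}  {after:>10}  {pct_diff(before, after):>7.2f}%")
```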

If we are interested in the allocation success/fail ratio within the slowpath,
then I think we need more counters.
Such as: direct_reclaim_success/fail, kswapd_success/fail (just like compaction
Or, we can have a pgalloc_success_fastpath counter.
Then we can do:
pgalloc_success_in_slowpath = (pgalloc_normal - pgalloc_success_fastpath)
Therefore, the success_ratio for the slowpath could be:

(pgalloc_success_in_slowpath/slowpath_entered) * 100
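
As a sketch, the two formulas above combine as follows. Note that pgalloc_success_fastpath is only a proposed counter and does not exist today, and the function name slowpath_success_ratio is hypothetical.

```python
def slowpath_success_ratio(pgalloc_normal, pgalloc_success_fastpath, slowpath_entered):
    """Percentage of slowpath entries that ended in a successful allocation.

    pgalloc_success_fastpath is the proposed (not yet existing) counter.
    Returns None when the slowpath was never entered.
    """
    if slowpath_entered == 0:
        return None
    pgalloc_success_in_slowpath = pgalloc_normal - pgalloc_success_fastpath
    return pgalloc_success_in_slowpath / slowpath_entered * 100
```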

More comments are welcome.
