Re: [PATCH 2/3] mm: page allocator: Calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake

From: Andrew Morton
Date: Fri Sep 03 2010 - 19:30:08 EST


On Fri, 3 Sep 2010 18:17:46 -0500 (CDT)
Christoph Lameter <cl@xxxxxxxxx> wrote:

> On Fri, 3 Sep 2010, Andrew Morton wrote:
>
> > Can someone remind me why per_cpu_pageset went and reimplemented
> > percpu_counters rather than just using them?
>
> The vm counters are per zone and per cpu and have a flow from per cpu /
> zone deltas to zone counters and then also into global counters.

hm. percpu counters would require overflow-time hooks to do that.
Might be worth looking at.

> > Is this really the best way of doing it? The way we usually solve
> > this problem (and boy, was this bug a newbie mistake!) is:
> >
> > 	foo = percpu_counter_read(x);
> >
> > 	if (foo says something bad) {
> > 		/* Bad stuff: let's get a more accurate foo */
> > 		foo = percpu_counter_sum(x);
> > 	}
> >
> > 	if (foo still says something bad)
> > 		do_bad_thing();
> >
> > In other words, don't do all this stuff with percpu_drift_mark and the
> > kswapd heuristic. Just change zone_watermark_ok() to use the more
> > accurate read if it's about to return "no".
>
> percpu counters must always be added up when their value is determined.

Nope. That's the difference between percpu_counter_read() and
percpu_counter_sum().
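
To make the distinction concrete, here is a minimal userspace sketch of
the two kinds of read and of the "only pay for accuracy when about to
say no" check. It is a model under stated assumptions, not kernel code:
struct model_counter, the model_*() helpers, NR_CPUS and BATCH are all
hypothetical names invented for illustration, and there is no locking
or real per-cpu storage.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */
#define BATCH   32	/* hypothetical per-CPU fold threshold */

/* Userspace model only; this is not the kernel's struct percpu_counter. */
struct model_counter {
	long count;		/* shared total, may lag behind reality */
	long delta[NR_CPUS];	/* per-CPU changes not yet folded in */
};

/* Accumulate locally; fold into the shared total once BATCH is reached. */
static void model_add(struct model_counter *c, int cpu, long amount)
{
	long d;

	assert(cpu >= 0 && cpu < NR_CPUS);
	d = c->delta[cpu] + amount;
	if (d >= BATCH || d <= -BATCH) {
		c->count += d;
		c->delta[cpu] = 0;
	} else {
		c->delta[cpu] = d;
	}
}

/* Cheap read: shared total only, ignores pending per-CPU deltas. */
static long model_read(const struct model_counter *c)
{
	return c->count;
}

/* Accurate read: shared total plus every pending per-CPU delta. */
static long model_sum(const struct model_counter *c)
{
	long sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->delta[cpu];
	return sum;
}

/* Use the cheap read first; do the expensive sum only before saying "no". */
static int model_watermark_ok(const struct model_counter *c, long min)
{
	long free_pages = model_read(c);

	if (free_pages <= min)
		free_pages = model_sum(c);	/* more accurate second look */

	return free_pages > min;
}
```

With the shared total at 100 and each of the four modelled CPUs holding
a pending delta of 10, model_read() returns 100 and model_sum() returns
140, so a watermark of 120 fails on the cheap read but passes after the
accurate sum. The common case, where the cheap read is already
comfortably above the watermark, never walks the per-CPU deltas.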

> This seems to be a special case here where Mel does not want to have the
> cost to bring the counters up to date nor reduce the delta/time limits to
> get some more accuracy but wants to take some sort of snapshot of the whole
> situation for this particular case.

My suggestion didn't actually have anything to do with percpu_counters.

