Re: [PATCH 4/4] Memory controller soft limit reclaim on contention (v3)

From: KOSAKI Motohiro
Date: Tue Mar 03 2009 - 19:07:43 EST


Hi Balbir

> > > > kswapd's role is increasing free pages until zone->pages_high in its "own node".
> > > > mem_cgroup_soft_limit_reclaim() frees one (or more) excess pages in any node.
> > > >
> > > > Oh, well.
> > > > I think that is not consistent.
> > > >
> > > > If mem_cgroup_soft_limit_reclaim() were aware of the target node and its
> > > > pages_high, I'd be glad.
> > >
> > > Yes, correct, the role of kswapd is to keep increasing free pages until
> > > zone->pages_high and the first set of pages to consider is the memory
> > > controller over their soft limits. We pass the zonelist to ensure that
> > > while doing soft reclaim, we focus on the zonelist associated with the
> > > node. Kamezawa had concerns over calling the soft limit reclaim from
> > > __alloc_pages_internal(), did you prefer that call path?
> >
> > I read your patch again.
> > So, it seems that calling mem_cgroup_soft_limit_reclaim() from inside
> > balance_pgdat() would be better.
> >
> > Please imagine the worst-case scenario.
> > CPU0 (kswapd) keeps shrinking.
> > CPU1 runs another activity and charges the memcg continuously.
> > In that case, balance_pgdat() doesn't exit for a very long time, and then
> > mem_cgroup_soft_limit_reclaim() is never called.
> >
>
> Yes, true... that is why I added the hooks in __alloc_pages_internal()
> in the first two revisions, but Kamezawa objected to them. In the
> scenario that you mention that balance_pgdat() is busy, if we are
> under global system memory pressure, even after freeing memory from
> soft limited cgroups, we don't have sufficient free memory. We need to
> go reclaim from the whole system. An administrator can easily avoid
> the above scenario by using hard limits on the cgroup running on CPU1.

I agree that implementing the soft limit is difficult.

But I still don't like soft limit reclaim in __alloc_pages_internal().
If it is done there, kswapd reclaims pages from the global LRU *before*
the soft limit is shrunk.

Again, the Linux reclaim policy is

free < pages_low: run kswapd
free < pages_min: foreground reclaim via __alloc_pages_internal()

Then, if soft limit reclaim is put into __alloc_pages_internal(), it becomes

free < pages_low: run kswapd
free < pages_min: soft limit reclaim and
                  foreground reclaim via __alloc_pages_internal()

That seems like unintentional behavior.
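
To make the ordering problem concrete, here is a rough sketch of the two policies as plain C. It is only a model of the watermark comparison, not kernel code: the enum, struct zone_marks and both function names are hypothetical and exist only for this illustration.

```c
/*
 * Simplified model of the watermark policy discussed above.
 * All names here are hypothetical; none of this is kernel code.
 */
enum reclaim_action {
    RECLAIM_NONE       = 0,
    RECLAIM_KSWAPD     = 1 << 0,  /* background reclaim from the global LRU */
    RECLAIM_FOREGROUND = 1 << 1,  /* direct reclaim in the allocation path */
    RECLAIM_SOFT_LIMIT = 1 << 2,  /* reclaim from cgroups over their soft limit */
};

struct zone_marks {
    unsigned long pages_min;
    unsigned long pages_low;
};

/* Current policy: kswapd below pages_low, foreground reclaim below pages_min. */
static unsigned int current_policy(unsigned long free_pages,
                                   const struct zone_marks *z)
{
    unsigned int actions = RECLAIM_NONE;

    if (free_pages < z->pages_low)
        actions |= RECLAIM_KSWAPD;
    if (free_pages < z->pages_min)
        actions |= RECLAIM_FOREGROUND;
    return actions;
}

/*
 * Policy if soft limit reclaim is hooked into the allocation path:
 * it can only trigger once free pages have already dropped below pages_min.
 */
static unsigned int soft_limit_in_alloc_path(unsigned long free_pages,
                                             const struct zone_marks *z)
{
    unsigned int actions = current_policy(free_pages, z);

    if (actions & RECLAIM_FOREGROUND)
        actions |= RECLAIM_SOFT_LIMIT;
    return actions;
}
```

With, say, pages_min = 100 and pages_low = 200, any free count between 100 and 199 yields only RECLAIM_KSWAPD in the second function. In that whole range kswapd is already shrinking the global LRU while the cgroups over their soft limit are left untouched.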

In addition, I still strongly oppose the global lock, even if
soft limit shrinking is not put into __alloc_pages_internal().
I think that problem doesn't depend on where the caller is placed.

