RE: mm: pages are not freed from lru_add_pvecs after process termination

From: Odzioba, Lukasz
Date: Wed May 04 2016 - 15:42:08 EST


On Thu 02-05-16 03:00:00, Michal Hocko wrote:
> So I have given this a try (not tested yet) and it doesn't look terribly
> complicated. It is hijacking vmstat for a purpose it wasn't intended for
> originally but creating a dedicated kernel threads/WQ sounds like an
> overkill to me. Does this help or do we have to be more aggressive and
> wake up shepherd from the allocator slow path? Could you give it a try
> please?

It seems to work fine, but the time it takes to drain the lists varies a lot:
sometimes a couple of seconds, sometimes over two minutes. I believe that is
acceptable.

I have an app which allocates almost all of the memory from a NUMA node. With
just the second patch applied, 30-50% of 100 consecutive executions got killed.
After also applying your first patch I haven't seen any OOM kill activity - great.

I was wondering how many times lru_add_drain() is called. After boot, with the
machine idle, it was a bit over 5k calls during the first 400s, and with some
activity it went up to 15k calls during 700s (including the 5k from the previous
experiment), which sounds fair to me given the big CPU count.

Do you see any advantages of dropping THP from pagevecs over this solution?

Thanks,
Lukas