RE: mm: pages are not freed from lru_add_pvecs after process termination

From: Odzioba, Lukasz
Date: Thu May 05 2016 - 13:25:29 EST


On Thu 05-05-16 09:21:00, Michal Hocko wrote:
> OK, it wasn't that tricky afterall. Maybe I have missed something but
> the following should work. Or maybe the async nature of flushing turns
> out to be just impractical and unreliable and we will end up skipping
> THP (or all compound pages) for pcp LRU add cache. Let's see...

Initially this issue was found on RH's 3.10.x kernel, but now I am using
4.6-rc6.

In overall it does help and under heavy load it is slightly better than the
second patch. Unfortunately I am still able to hit 10-20% oom kills with it -
(went down from 30-50%) partially due to earlier vmstat_update call
- it went up to 25-25% with this patch below:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b4359f8..7a5ab0d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3264,17 +3264,17 @@ retry:
if (!is_thp_gfp_mask(gfp_mask) || (current->flags & PF_KTHREAD))
migration_mode = MIGRATE_SYNC_LIGHT;

- if(!vmstat_updated) {
- vmstat_updated = true;
- kick_vmstat_update();
- }
-
/* Try direct reclaim and then allocating */
page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
&did_some_progress);
if (page)
goto got_pg;

+ if(!vmstat_updated) {
+ vmstat_updated = true;
+ kick_vmstat_update();
+ }

I don't quite see an uninvasive way to make sure that we drain all pvecs
before failing allocation and doing it asynchronously will race allocations
anyway - I guess.

Thanks,
Lukas