Re: [PATCH 5/8] mm: move lazily freed pages to inactive list

From: Minchan Kim
Date: Mon Nov 02 2015 - 19:52:38 EST


On Fri, Oct 30, 2015 at 10:22:12AM -0700, Shaohua Li wrote:
> On Fri, Oct 30, 2015 at 04:01:41PM +0900, Minchan Kim wrote:
> > MADV_FREE is a hint that it's okay to discard pages if there is memory
> > pressure, and we use reclaimers (i.e., kswapd and direct reclaim) to free
> > them. There is no value in keeping them on the active anonymous LRU, so
> > this patch moves them to the head of the inactive LRU list.
> >
> > This means that MADV_FREE-ed pages which were living on the inactive list
> > are reclaimed first because they are more likely to be cold rather than
> > recently active pages.
> >
> > An arguable issue with this approach is whether we should put the page
> > at the head or the tail of the inactive list. I chose the head because the
> > kernel cannot be sure whether a page is really cold or warm for every
> > MADV_FREE use case, but at least we know it's not *hot*, so landing at the
> > inactive head is a compromise across use cases.
> >
> > This fixes suboptimal behavior of MADV_FREE when pages living on the
> > active list will sit there for a long time even under memory pressure
> > while the inactive list is reclaimed heavily. This basically breaks the
> > whole purpose of using MADV_FREE to help the system free memory which
> > might not be used.
>
> My main concern is the policy for how we should treat the FREE pages. Moving
> them to the inactive LRU is definitely a good start, but I'm wondering if it's
> enough. MADV_FREE increases memory pressure and causes unnecessary reclaim
> because of the lazy memory free. While MADV_FREE is intended to be a better
> replacement for MADV_DONTNEED, MADV_DONTNEED doesn't have the memory pressure
> issue as it frees memory immediately. So I hope MADV_FREE doesn't have an
> impact on memory pressure either. I'm thinking of adding an extra LRU list and
> watermark for this to make sure FREE pages can be freed before system-wide
> page reclaim. As you said, this is arguable, but I hope we can discuss this
> issue more.

Yes, it's arguable. ;-)

It seems the divergence comes from the view that MADV_FREE is a *replacement*
for MADV_DONTNEED. But I don't think it is. If we could discard a MADV_FREEed
page *anytime*, I would agree, but that's not true because the page may be
dirty again by the time the VM wants to reclaim it.
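
To make that concrete, here is a minimal userspace sketch. It assumes a Linux
kernel and libc headers that know MADV_FREE; lazy_free_then_reuse() is a
made-up name for illustration, not something from the patch. It relies only on
the one thing the hint promises: a write after MADV_FREE cancels the lazy free
for that page, which is why the VM cannot drop the page blindly.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a byte written after MADV_FREE reads back correctly,
 * 0 if it does not, and -1 if mmap() or madvise(MADV_FREE) is unavailable.
 */
int lazy_free_then_reuse(void)
{
#ifndef MADV_FREE
	return -1;			/* headers too old to know the flag */
#else
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	unsigned char *p;
	volatile unsigned char *vp;
	int ok;

	assert(page > 0);
	len = (size_t)page;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xAA, len);		/* dirty the page */

	if (madvise(p, len, MADV_FREE) != 0) {
		munmap(p, len);
		return -1;		/* kernel without MADV_FREE */
	}

	/* Until the next write, the kernel is free to drop the page. */
	vp = p;
	vp[0] = 0x55;			/* re-dirty: cancels the lazy free */
	ok = (vp[0] == 0x55);
	/* vp[1] is now either 0xAA (page kept) or 0 (page already dropped). */

	munmap(p, len);
	return ok;
#endif
}
```

With MADV_DONTNEED the old contents are gone as soon as the call returns; with
MADV_FREE the outcome for the untouched bytes depends on whether reclaim got
there before the write.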

I'm also against your suggestion to discard FREEed pages before system-wide
page reclaim, because the system may have lots of clean, cold page cache or
anonymous pages. In that case, reclaiming those would be better.
Yes, it's really workload-dependent, so we might need some heuristic, which is
normally what we want to avoid.

Having said that, I agree we could do better than deactivation and, frankly
speaking, I'm thinking of another LRU list (tentatively named the
"ezreclaim LRU list"). What I have in mind is to age (anon|file|ez)
fairly. IOW, I want to fold ez-LRU reclaim into get_scan_count.
When MADV_FREE is called, we could move the hinted pages from the anon-LRU to
the ez-LRU, and if the VM then finds it cannot discard a page on the ez-LRU,
it could promote the page to the active-anon-LRU. That would be a very natural
aging concept because it means someone touched the page recently.
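
As a rough illustration of those transitions, here is a toy userspace model.
Every name in it (toy_page, toy_madv_free, toy_scan_ez and so on) is
hypothetical and nothing here is kernel API; it only encodes the rule
described above: a clean page on the ez list is discarded, a page dirtied
after the hint is promoted back to active anon.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these lists and names are hypothetical. */
enum toy_lru { TOY_ACTIVE_ANON, TOY_INACTIVE_ANON, TOY_EZ, TOY_FREED };

struct toy_page {
	enum toy_lru lru;
	bool dirty;		/* written since the MADV_FREE hint */
};

/* MADV_FREE hint: move an anon page to the ez list and clear dirty. */
void toy_madv_free(struct toy_page *pg)
{
	assert(pg->lru == TOY_ACTIVE_ANON || pg->lru == TOY_INACTIVE_ANON);
	pg->lru = TOY_EZ;
	pg->dirty = false;
}

/* A store after the hint marks the page dirty again. */
void toy_write(struct toy_page *pg)
{
	/* A freed page would fault in a fresh zero page; not modeled here. */
	if (pg->lru != TOY_FREED)
		pg->dirty = true;
}

/*
 * Reclaim looks at one ez page: discard it if still clean, otherwise
 * promote it to active anon. Returns true if the page was freed.
 */
bool toy_scan_ez(struct toy_page *pg)
{
	assert(pg->lru == TOY_EZ);
	if (pg->dirty) {
		pg->lru = TOY_ACTIVE_ANON;
		return false;
	}
	pg->lru = TOY_FREED;
	return true;
}
```

The point of the promotion path is that no extra knob is needed: a page that
gets touched again simply re-enters normal anon aging.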

With that, I don't want to bias toward one side or add a knob for tuning the
heuristic; let's rely on the VM's common, fair aging scheme.
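
A very simplified sketch of what "no bias" could look like: every list gets
the same rule, scanning a share proportional to its own size at the current
reclaim priority. toy_scan_targets() and the three-list layout are
assumptions for illustration; the real get_scan_count() weighs more inputs
(swappiness, for one), so this only shows the same-rule-for-every-list idea.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_NR_LISTS 3	/* anon, file, ez: hypothetical ordering */

/*
 * Apply one rule to every list: scan size >> priority pages.
 * A high priority value means light pressure; 0 means scan everything.
 */
void toy_scan_targets(const unsigned long size[TOY_NR_LISTS],
		      unsigned int priority,
		      unsigned long scan[TOY_NR_LISTS])
{
	size_t i;

	assert(priority < sizeof(unsigned long) * CHAR_BIT);
	for (i = 0; i < TOY_NR_LISTS; i++)
		scan[i] = size[i] >> priority;
}
```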

Another bonus of a new LRU list is that we could support MADV_FREE on swapless
systems.

>
> Or do you want to push this first and address the policy issue later?

I believe adding a new LRU list would be controversial (i.e., not trivial)
from a maintainer's POV, even though the code wouldn't be complicated.
So I want to see problems in *real practice*, not in some theoretical
test program, before diving into that.
To hear such requests, we should release the syscall.
So, I want to push this first.

>
> Thanks,
> Shaohua