Re: [PATCH v7 00/12] Support non-lru page migration
From: Daniel Vetter
Date: Wed Jun 01 2016 - 18:40:22 EST
On Wed, Jun 01, 2016 at 02:41:51PM -0700, Andrew Morton wrote:
> On Wed, 1 Jun 2016 08:21:09 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:
>
> > Recently, I got many reports about perfermance degradation in embedded
> > system(Android mobile phone, webOS TV and so on) and easy fork fail.
> >
> > The problem was fragmentation caused by zram and GPU driver mainly.
> > With memory pressure, their pages were spread out all of pageblock and
> > it cannot be migrated with current compaction algorithm which supports
> > only LRU pages. In the end, compaction cannot work well so reclaimer
> > shrinks all of working set pages. It made system very slow and even to
> > fail to fork easily which requires order-[2 or 3] allocations.
> >
> > Other pain point is that they cannot use CMA memory space so when OOM
> > kill happens, I can see many free pages in CMA area, which is not
> > memory efficient. In our product which has big CMA memory, it reclaims
> > zones too exccessively to allocate GPU and zram page although there are
> > lots of free space in CMA so system becomes very slow easily.
>
> But this isn't presently implemented for GPU drivers or for CMA, yes?
>
> What's the story there?
Broken (out-of-tree) drivers that don't allocate their gpu stuff
correctly. There's piles of drivers that get_user_page all over the place
but then fail to timely get off these pages again. The fix is to get off
those pages again (either by unpinning timely, or registering an
mmu_notifier if the driver wants to keep the pages pinned indefinitely, as
a caching optimization).
At least that's my guess, and iirc it was confirmed first time around this
series showed up.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch