Re: [PATCH v3 03/16] mm: add non-lru movable page support document

From: Minchan Kim
Date: Wed Apr 06 2016 - 22:27:07 EST


On Mon, Apr 04, 2016 at 03:09:22PM +0200, Vlastimil Babka wrote:
> On 04/04/2016 04:25 AM, Minchan Kim wrote:
> >>
> >>Ah, I see, so it's designed with page lock to handle the concurrent isolations etc.
> >>
> >>In http://marc.info/?l=linux-mm&m=143816716511904&w=2 Mel has warned
> >>about doing this in general under page_lock and suggested that each
> >>user handles concurrent calls to isolate_page() internally. Might be
> >>more generic that way, even if all current implementers will
> >>actually use the page lock.
> >
> >We need PG_lock for two reasons.
> >
> >Firstly, it guarantees the atomicity of page flag operations (i.e.,
> >PG_movable, PG_isolated). Secondly, it keeps page->mapping->a_ops stable.
> >
> >For example,
> >
> >isolate_migratepages_block
> >        if (PageMovable(page))
> >                isolate_movable_page
> >                        get_page_unless_zero <--- 1
> >                        trylock_page
> >                        page->mapping->a_ops->isolate_page <--- 2
> >
> >Between 1 and 2, the driver can nullify page->mapping, so we need PG_lock.
>
> Hmm I see, that really doesn't seem easily solvable without page_lock.
> My idea is that compaction code would just check PageMovable() and
> PageIsolated() to find a candidate.
> page->mapping->a_ops->isolate_page would do the necessary
> driver-specific locking, revalidate the page state, and either succeed
> at isolation or fail. It would need to handle the possibility that the

So you mean that the VM may try to isolate a false-positive page of the driver?
I don't think that's a good idea. To handle that, every driver would have to
keep some logic for detecting such false positives, which means its own
data structure or something similar to remember whether a page passed from
the VM is valid or not. That makes the driver's logic more complicated and
requires more code. It's not a good deal.

> page already doesn't belong to the mapping, which is probably not a
> problem. But what if the driver is a module that was already
> unloaded, and even though we did NULL-check each part from page to
> isolate_page, it points to a function that's already gone? That
> would need some extra handling to prevent that, hm...

Yes, the driver should clean up the pages it is using. For that, we need some lock.
I think page_lock is a good fit because we are migrating a *page*, and page_lock
has been used in the migration path for a long time.
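
To make the ordering concrete, here is a minimal user-space sketch of the
pattern discussed above. It is not kernel code and not the patch itself: every
type and every *_model name is hypothetical, the page lock is modeled with a
C11 atomic_flag, and the driver teardown spins where the real lock_page() would
sleep. It only shows that both the isolation path and the driver's cleanup go
through the same lock, so page->mapping cannot be cleared between steps 1 and 2.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: these mimic, but are not, the kernel's types. */
struct page;

struct a_ops_model {
	bool (*isolate_page)(struct page *page);
};

struct mapping_model {
	const struct a_ops_model *a_ops;
};

struct page {
	atomic_int refcount;           /* models the page reference count */
	atomic_flag locked;            /* models PG_locked */
	struct mapping_model *mapping; /* cleared by the driver under the lock */
	bool movable;                  /* models PG_movable */
	bool isolated;                 /* models PG_isolated */
};

static void page_model_init(struct page *p, struct mapping_model *m)
{
	atomic_init(&p->refcount, 1);
	atomic_flag_clear(&p->locked);
	p->mapping = m;
	p->movable = true;
	p->isolated = false;
}

/* Take a reference unless the count has already dropped to zero. */
static bool get_page_unless_zero_model(struct page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

static void put_page_model(struct page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
}

static bool trylock_page_model(struct page *p)
{
	return !atomic_flag_test_and_set(&p->locked);
}

static void unlock_page_model(struct page *p)
{
	atomic_flag_clear(&p->locked);
}

/* Example driver callback; a real driver would do its own bookkeeping. */
static bool demo_isolate_page(struct page *p)
{
	(void)p;
	return true;
}

/* VM side: on success the extra reference is kept for the migration. */
static bool isolate_movable_page_model(struct page *p)
{
	bool ok = false;

	if (!get_page_unless_zero_model(p))          /* step 1 */
		return false;
	if (!trylock_page_model(p))
		goto out_put;
	/* Recheck under the lock: the driver may have released the page. */
	if (!p->movable || p->isolated || !p->mapping)
		goto out_unlock;
	ok = p->mapping->a_ops->isolate_page(p);     /* step 2 */
	if (ok)
		p->isolated = true;
out_unlock:
	unlock_page_model(p);
out_put:
	if (!ok)
		put_page_model(p);
	return ok;
}

/* Driver side: clear the flag and the mapping only while holding the lock. */
static void driver_release_page_model(struct page *p)
{
	while (!trylock_page_model(p))
		; /* spin; the real lock_page() would sleep instead */
	p->movable = false;
	p->mapping = NULL;
	unlock_page_model(p);
}
```

In this model, a page released through driver_release_page_model() is simply
rejected by isolate_movable_page_model() at the recheck, so the driver needs
no extra data structure to detect stale pages handed over by the VM.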