Re: [RFC PATCH v2 0/4] mm: reclaim zbud pages on migration and compaction

From: Minchan Kim
Date: Sun Aug 11 2013 - 23:50:37 EST


Hello Benjamin,

On Sun, Aug 11, 2013 at 11:16:47PM -0400, Benjamin LaHaise wrote:
> Hello Minchan,
>
> On Mon, Aug 12, 2013 at 11:25:35AM +0900, Minchan Kim wrote:
> > Hello,
> >
> > On Fri, Aug 09, 2013 at 12:22:16PM +0200, Krzysztof Kozlowski wrote:
> > > Hi,
> > >
> > > Currently zbud pages are not movable and they cannot be allocated from CMA
> > > region. These patches try to address the problem by:
> >
> > zcache, zram and GUP pages are in the same situation with respect to
> > memory-hotplug and/or CMA.
> >
> > > 1. Adding a new form of reclaim of zbud pages.
> > > 2. Reclaiming zbud pages during migration and compaction.
> > > 3. Allocating zbud pages with __GFP_RECLAIMABLE flag.
> >
> > So I'd like to solve it with a general approach.
> >
> > Each subsystem or GUP caller that wants to pin pages for a long time should
> > create its own migration handler and register the page with a pin-page
> > control subsystem, like this.
> >
> > driver/foo.c
> >
> > int foo_migrate(struct page *page, void *private);
> >
> > /* named foo_owner so it does not clash with the foo_migrate() callback */
> > static struct pin_page_owner foo_owner = {
> > 	.migrate = foo_migrate,
> > };
> >
> > int foo_allocate(void)
> > {
> > 	struct page *newpage = alloc_pages(); /* gfp mask and order omitted */
> >
> > 	set_pinned_page(newpage, &foo_owner);
> > 	return 0;
> > }
> >
> > And in compaction.c, or wherever the VM wants to move/reclaim the page,
> > the general VM code can ask the owner when it finds a pinned page.
> >
> > mm/compaction.c
> >
> > if (PagePinned(page)) {
> > struct pin_page_info *info = get_page_pin_info(page);
> > info->migrate(page);
> >
> > }
> >
> > The only hurdle is that we would have to introduce a new page flag, but
> > I believe that if we all agree on this approach, we can find a solution
> > in the end.
> >
> > What do you think?
>
> I don't like this approach. There will be too many collisions in the
> hash that's been implemented (read: I don't think you can get away with

Yep. That's why I'd like to replace it with a radix tree keyed by pfn, as
I mentioned in a comment (I just used a hash for fast prototyping, without
much consideration).
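
To illustrate what I mean by indexing on pfn instead of hashing, here is a
rough userspace sketch. It is only a toy model, not the kernel's radix tree
API, and every name in it (pin_node, pin_tree_insert, pin_tree_lookup, the
PIN_* constants) is made up for illustration. The point is that each pfn maps
to exactly one slot, so there are no collisions to resolve.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; this is not the kernel's radix tree API. */
#define PIN_BITS   6                    /* 64 slots per node */
#define PIN_SLOTS  (1u << PIN_BITS)
#define PIN_LEVELS 3                    /* covers pfns below 2^18 */

struct pin_node {
	void *slot[PIN_SLOTS];          /* child node, or the owner at the last level */
};

/* Slot index for `pfn` at `level`; level 0 is the root (most significant bits). */
static unsigned int pin_index(unsigned long pfn, int level)
{
	int shift = (PIN_LEVELS - 1 - level) * PIN_BITS;

	return (unsigned int)(pfn >> shift) & (PIN_SLOTS - 1);
}

/* Non-zero if `pfn` fits in this toy tree. */
static int pin_in_range(unsigned long pfn)
{
	return (pfn >> (PIN_LEVELS * PIN_BITS)) == 0;
}

/* Record `owner` for `pfn`. Returns 0 on success, -1 on failure. */
static int pin_tree_insert(struct pin_node *root, unsigned long pfn, void *owner)
{
	struct pin_node *node = root;
	int level;

	assert(root);
	if (!pin_in_range(pfn))
		return -1;

	/* Walk the interior levels, allocating missing nodes on the way down. */
	for (level = 0; level < PIN_LEVELS - 1; level++) {
		unsigned int idx = pin_index(pfn, level);

		if (!node->slot[idx]) {
			node->slot[idx] = calloc(1, sizeof(struct pin_node));
			if (!node->slot[idx])
				return -1;
		}
		node = node->slot[idx];
	}
	node->slot[pin_index(pfn, PIN_LEVELS - 1)] = owner;
	return 0;
}

/* Return the owner recorded for `pfn`, or NULL if there is none. */
static void *pin_tree_lookup(const struct pin_node *root, unsigned long pfn)
{
	const struct pin_node *node = root;
	int level;

	assert(root);
	if (!pin_in_range(pfn))
		return NULL;

	for (level = 0; level < PIN_LEVELS - 1; level++) {
		node = node->slot[pin_index(pfn, level)];
		if (!node)
			return NULL;
	}
	return node->slot[pin_index(pfn, PIN_LEVELS - 1)];
}
```

Lookup cost is bounded by the tree height rather than by chain length, which
is the property I'm after. Locking and node freeing are left out on purpose.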

> a naive implementation for core infrastructure that has to suit all
> users), you've got a global spin lock, and it doesn't take into account

I think batched draining of pinned pages would be sufficient to avoid the
global spinlock problem, because we already use that technique in the page
allocator, which is one of the most critical hot paths.
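
Roughly, the batching idea looks like the following userspace sketch. It is a
single-threaded toy under my own assumptions: pin_batch, pin_global and the
PIN_* constants are hypothetical names, and a counter stands in for the real
spinlock so that the number of lock acquisitions is visible. In a real
implementation each CPU would own one pin_batch.

```c
#include <assert.h>
#include <stddef.h>

#define PIN_BATCH      8        /* entries collected before touching the lock */
#define PIN_GLOBAL_MAX 1024

/* Shared state; lock_acquisitions stands in for a real spinlock. */
struct pin_global {
	unsigned long pfn[PIN_GLOBAL_MAX];
	size_t count;
	unsigned long lock_acquisitions;
};

/* Local buffer; one per CPU in a real implementation. */
struct pin_batch {
	unsigned long pfn[PIN_BATCH];
	size_t count;
};

/* Move buffered entries to the global table under one lock acquisition. */
static size_t pin_batch_drain(struct pin_batch *b, struct pin_global *g)
{
	size_t moved = 0, i;

	assert(b && g);
	if (b->count == 0)
		return 0;

	g->lock_acquisitions++;         /* spin_lock() would go here */
	while (moved < b->count && g->count < PIN_GLOBAL_MAX)
		g->pfn[g->count++] = b->pfn[moved++];
	/* spin_unlock() would go here */

	/* Keep whatever did not fit at the front of the local buffer. */
	for (i = moved; i < b->count; i++)
		b->pfn[i - moved] = b->pfn[i];
	b->count -= moved;
	return moved;
}

/* Buffer one pfn locally, draining first if the buffer is full. */
static int pin_batch_add(struct pin_batch *b, struct pin_global *g,
			 unsigned long pfn)
{
	assert(b && g);
	if (b->count == PIN_BATCH)
		pin_batch_drain(b, g);
	if (b->count == PIN_BATCH)
		return -1;              /* global table is full */
	b->pfn[b->count++] = pfn;
	return 0;
}
```

With PIN_BATCH set to 8, the lock is taken once per eight registrations
instead of once per page. The cost is that entries still sitting in a local
buffer are invisible to a global lookup until they are drained, so anyone
scanning for pinned pages would have to drain first.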

> NUMA issues. The address space migratepage method doesn't have those

NUMA issues? Could you elaborate on that a bit?

> issues (at least where it is usable as in aio's use-case).
>
> If you're going to go down this path, you'll have to decide if *all* users
> of pinned pages are going to have to subscribe to supporting the un-pinning
> of pages, and that means taking a real hard look at how O_DIRECT pins pages.
> Once you start thinking about that, you'll find that addressing the
> performance concerns is going to be an essential part of any design work to
> be done in this area.

True. The patch I included just shows the concept, so I didn't consider any
performance-critical parts, but if we all agree that this approach makes sense
and that we can implement it with little overhead, I will move on to the next
phase and improve performance.
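
For reference, the concept boils down to the following userspace mock. It is
not the code from the patch: struct page here is a stand-in for the kernel
structure, the `pinned` field models the proposed page flag, and the extra
`private` argument to set_pinned_page() and the migrate_pinned_page() helper
are my assumptions for the sake of a self-contained example.

```c
#include <assert.h>
#include <stddef.h>

struct page;

/* Callbacks supplied by whoever pins the page. */
struct pin_page_owner {
	int (*migrate)(struct page *page, void *private);
};

/* Stand-in for the kernel's struct page; `pinned` models the new page flag. */
struct page {
	int pinned;
	const struct pin_page_owner *owner;
	void *private;
};

/* Mark `page` as pinned and remember who can move it. */
static void set_pinned_page(struct page *page,
			    const struct pin_page_owner *owner, void *private)
{
	assert(page && owner);
	page->pinned = 1;
	page->owner = owner;
	page->private = private;
}

/*
 * What compaction would call when it meets a page. Returns 0 if the page
 * is not pinned or the owner moved it, -1 if the owner cannot migrate.
 */
static int migrate_pinned_page(struct page *page)
{
	assert(page);
	if (!page->pinned)
		return 0;
	if (!page->owner || !page->owner->migrate)
		return -1;
	return page->owner->migrate(page, page->private);
}

/* Example owner: counts how often it was asked to move its page. */
static int foo_migrate(struct page *page, void *private)
{
	int *calls = private;

	(*calls)++;
	page->pinned = 0;       /* pretend the data was moved and the pin dropped */
	return 0;
}

static const struct pin_page_owner foo_owner = {
	.migrate = foo_migrate,
};
```

The VM side never needs to know what the owner keeps in the page. It only
checks the flag and calls back, and an owner that cannot move its page
reports failure so the caller can skip it.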

Thanks for the input, Ben!

>
> -ben
> --
> "Thought is the essence of where you are now."

--
Kind regards,
Minchan Kim