Re: [PATCH] mm: skip the page buddy block instead of one page

From: Minchan Kim
Date: Thu Aug 15 2013 - 09:19:51 EST


Hi Mel,

On Thu, Aug 15, 2013 at 12:30:19PM +0100, Mel Gorman wrote:
> On Thu, Aug 15, 2013 at 01:17:55PM +0900, Minchan Kim wrote:
> > Hello,
> >
>
> Well, this thread managed to get out of control for no good reason!
>
> > > > <SNIP>
> > > > So, what's the result by that?
> > > > As I said, it's just skipping (pageblock_nr_pages -1) at worst case
> > >
> > > Hi Minchan,
> > > I mean that if private is set to a large number, it will skip 2^private
> > > pages, not (pageblock_nr_pages - 1). I found that other code, such as
> > > filesystems, uses page->private. Here is the comment about private:
> > > /* Mapping-private opaque data:
> > >  * usually used for buffer_heads
> > >  * if PagePrivate set; used for
> > >  * swp_entry_t if PageSwapCache;
> > >  * indicates order in the buddy
> > >  * system if PG_buddy is set.
> > >  */
> >
> > Please read the full thread in detail.
> >
> > Mel suggested the following:
> >
> > 	if (PageBuddy(page)) {
> > 		int nr_pages = (1 << page_order(page)) - 1;
> > 		if (PageBuddy(page)) {
> > 			nr_pages = min(nr_pages, MAX_ORDER_NR_PAGES - 1);
> > 			low_pfn += nr_pages;
> > 			continue;
> > 		}
> > 	}
> >
> > min(nr_pages, xxx) removes your concern, but I think Mel's version
> > isn't right. The skip should be aligned with the pageblock boundary,
> > so I suggested the following.
> >
>
> Why? We're looking for pages to migrate. If the page is free and at the
> maximum order then there is no point searching in the middle of a free
> page.

The isolate_migratepages_range() API works on [low_pfn, end_pfn),
and we can't guarantee that page_order() returns a valid value in the
normal compaction path, so I'd like to limit the skip conservatively
to end_pfn.
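
To make the clamp concrete, here is a minimal user-space sketch of the
idea. It is not kernel code: skip_buddy_clamped() is a hypothetical
helper name, and the order argument stands in for whatever page_order()
happened to return, including a stale value read in a race.

```c
#include <assert.h>

/*
 * Hypothetical user-space model, not kernel code.
 * Returns the pfn the scanner would hold after skipping a block of
 * 2^order pages starting at low_pfn, clamped to end_pfn. The caller's
 * loop is assumed to advance by one more pfn afterwards.
 */
static unsigned long skip_buddy_clamped(unsigned long low_pfn,
					unsigned long end_pfn,
					unsigned int order)
{
	unsigned long skip;

	assert(low_pfn < end_pfn);

	/* A stale order may be arbitrarily large; avoid an invalid shift. */
	if (order >= sizeof(unsigned long) * 8 - 1)
		return end_pfn;

	skip = (1UL << order) - 1;

	/* Same result as min(low_pfn + skip, end_pfn), without overflow. */
	if (skip > end_pfn - low_pfn)
		return end_pfn;

	return low_pfn + skip;
}
```

With a sane order the scanner lands on the last page of the free block;
with a bogus order it stops at end_pfn and never leaves the range it
was asked to scan.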

>
> > 	if (PageBuddy(page)) {
> > #ifdef CONFIG_MEMORY_ISOLATION
> > 		unsigned long order = page_order(page);
> > 		if (PageBuddy(page)) {
> > 			low_pfn += (1 << order) - 1;
> > 			low_pfn = min(low_pfn, end_pfn);
> > 		}
> > #endif
> > 		continue;
> > 	}
> >
> > so worst case is (pageblock_nr_pages - 1).
>
> No it isn't. The worst case is that the whole region being searched is
> skipped. For THP allocations, it would happen to work because end_pfn
> is the pageblock boundary, but that is not required by the API. I
> expect that end_pfn is not necessarily the next pageblock boundary for
> CMA allocations.

Mel, as I said earlier, CMA and memory-hotplug don't have the page_order
race, so we only need to consider the normal compaction path used for
high-order allocations (e.g., THP). So, for this race, the worst case
is skipping (pageblock_nr_pages - 1) pages.
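
As a rough check of that bound, the sketch below brute-forces every
starting pfn in one pageblock against a range of orders. It is a
user-space model only: it assumes end_pfn is the next pageblock
boundary (the normal compaction case argued above), and the
SKETCH_* constants and function names are made up for illustration.

```c
#include <assert.h>

/* Assumed values for illustration only; real kernels configure these. */
#define SKETCH_PAGEBLOCK_ORDER	9U
#define SKETCH_MAX_BOGUS_ORDER	20U

/*
 * Number of pages left unexamined when the scanner at low_pfn skips
 * (2^order - 1) pages and the result is clamped to end_pfn.
 */
static unsigned long pages_skipped(unsigned long low_pfn,
				   unsigned long end_pfn,
				   unsigned int order)
{
	unsigned long skip, room;

	assert(low_pfn < end_pfn);

	skip = (1UL << order) - 1;
	room = end_pfn - low_pfn - 1;	/* pages after low_pfn in range */

	return skip < room ? skip : room;
}

/* Largest skip over every start pfn in one pageblock and every order. */
static unsigned long worst_case_skipped(void)
{
	const unsigned long nr = 1UL << SKETCH_PAGEBLOCK_ORDER;
	const unsigned long start = 4 * nr;	/* arbitrary aligned block */
	const unsigned long end = start + nr;
	unsigned long pfn, n, worst = 0;
	unsigned int order;

	for (pfn = start; pfn < end; pfn++) {
		for (order = 0; order <= SKETCH_MAX_BOGUS_ORDER; order++) {
			n = pages_skipped(pfn, end, order);
			if (n > worst)
				worst = n;
		}
	}
	return worst;
}
```

Under those assumptions the maximum comes out as one less than the
number of pages in a pageblock. If end_pfn is not a pageblock boundary,
the same clamp bounds the skip by (end_pfn - low_pfn - 1) instead.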

>
> --
> Mel Gorman
> SUSE Labs

--
Kind regards,
Minchan Kim