Re: [RFC] mm: a question about high-order check in __zone_watermark_ok()

From: Michal Hocko
Date: Mon Sep 26 2016 - 04:58:58 EST


On Mon 26-09-16 16:47:57, Xishi Qiu wrote:
> commit 97a16fc82a7c5b0cfce95c05dfb9561e306ca1b1
> (mm, page_alloc: only enforce watermarks for order-0 allocations)
> rewrote the high-order check in __zone_watermark_ok(), but I think it
> quietly fixes a bug. Please see the following.
>
> Before this patch, the high-order check is this:
> __zone_watermark_ok()
> ...
> 	for (o = 0; o < order; o++) {
> 		/* At the next order, this order's pages become unavailable */
> 		free_pages -= z->free_area[o].nr_free << o;
>
> 		/* Require fewer higher order pages to be free */
> 		min >>= 1;
>
> 		if (free_pages <= min)
> 			return false;
> 	}
> ...
>
> If we have CMA memory and we allocate a high-order movable page, then this is right.
>
> But if we allocate a high-order unmovable page (e.g. the kernel stack in dup_task_struct()),
> and there are a lot of high-order CMA pages but few high-order unmovable
> pages, then it still returns *true*, but the allocation will *fail* in the end, because
> we cannot fall back from migrate_unmovable to migrate_cma, right?

AFAIR CMA wmark check was always tricky and the above commit has made
the situation at least a bit more clear. Anyway IIRC

#ifdef CONFIG_CMA
	/* If allocation can't use CMA areas don't use free CMA pages */
	if (!(alloc_flags & ALLOC_CMA))
		free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
#endif

	if (free_pages - free_cma <= min + z->lowmem_reserve[classzone_idx])
		return false;

should reduce the problem because a lot of free CMA pages, once
subtracted, should just get us below the wmark + reserve boundary.
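
To illustrate the arithmetic, here is a toy userspace model of that
order-0 check. It is only a sketch, not the kernel code: struct toy_zone
and toy_order0_ok() are made-up names, and the per-classzone
lowmem_reserve[] array is collapsed into a single value.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the zone counters used above. */
struct toy_zone {
	long free_pages;	/* all free pages, CMA included */
	long free_cma;		/* free pages sitting in CMA pageblocks */
	long min;		/* watermark being checked */
	long lowmem_reserve;	/* reserve for the requesting classzone */
};

/*
 * Mirrors the shape of the check quoted above: a request that cannot
 * use CMA has the free CMA pages subtracted before the comparison.
 */
static bool toy_order0_ok(const struct toy_zone *z, bool can_use_cma)
{
	long unusable_cma = can_use_cma ? 0 : z->free_cma;

	if (z->free_pages - unusable_cma <= z->min + z->lowmem_reserve)
		return false;

	return true;
}
```

With free_pages = 1000, free_cma = 900 and min = 128, a request that may
use CMA passes, while one that may not is left with 100 usable pages and
fails the check before any high-order test is reached.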
--
Michal Hocko
SUSE Labs