Re: [PATCH 1/2] mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

From: Michal Hocko
Date: Mon Oct 29 2018 - 06:08:39 EST


On Mon 29-10-18 20:42:53, Balbir Singh wrote:
> On Mon, Oct 29, 2018 at 10:00:35AM +0100, Michal Hocko wrote:
[...]
> > These hugetlb allocations might be disruptive and that is an expected
> > behavior because this is an explicit requirement from an admin to
> > pre-allocate large pages for future use. __GFP_RETRY_MAYFAIL just
> > underlines that requirement.
>
> Yes, but in the absence of a particular node, for example via sysctl
> (as the compaction does), I don't think it is a hard requirement to get
> a page from a particular node.

Again, this seems like a deliberate decision. You want the distribution
to be as even as possible, otherwise the NUMA placement will be much
less deterministic. At least that was the case for a long time. If you
have different per-node preferences, just use NUMA-aware pre-allocation.
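
As a rough illustration only (not part of this patch series): per-node
hugetlb pools can be sized through the per-node sysfs files under
/sys/devices/system/node/. The helper names below are made up for the
example, and the 2048kB default assumes the usual x86-64 hugetlb page
size. Writing to these files needs root.

```python
from pathlib import Path

# Per-node sysfs root on a NUMA-enabled kernel.
SYSFS_NODE_ROOT = Path("/sys/devices/system/node")


def node_hugepages_path(node, size_kb=2048, root=SYSFS_NODE_ROOT):
    """Build the path of the per-node nr_hugepages file (hypothetical helper)."""
    return (Path(root) / f"node{node}" / "hugepages"
            / f"hugepages-{size_kb}kB" / "nr_hugepages")


def set_node_hugepages(node, count, size_kb=2048, root=SYSFS_NODE_ROOT):
    """Request `count` huge pages on one node and return what was granted.

    The kernel may allocate fewer pages than requested, so the file is
    read back instead of assuming the request was fully satisfied.
    """
    path = node_hugepages_path(node, size_kb, root)
    path.write_text(f"{count}\n")
    return int(path.read_text())
```

For example, set_node_hugepages(1, 64) would ask for 64 pages on node 1
only, instead of relying on the even spread that the global
nr_hugepages knob gives you.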

> I agree we need __GFP_RETRY_MAYFAIL, in any
> case the real root cause for me is should_continue_reclaim() which keeps
> the task looping without making forward progress.

This seems like a separate issue which would be better debugged on its
own. Please open a new thread describing the problem and the state of
the node.

--
Michal Hocko
SUSE Labs