Re: [patch] mm, thp: always direct reclaim for MADV_HUGEPAGE even when deferred

From: Michal Hocko
Date: Wed Jan 04 2017 - 04:47:05 EST


On Wed 04-01-17 09:32:55, Vlastimil Babka wrote:
> On 01/03/2017 11:44 PM, David Rientjes wrote:
> > On Mon, 2 Jan 2017, Vlastimil Babka wrote:
[...]
> >>> echo "defer madvise" > /sys/kernel/mm/transparent_hugepage/defrag
> >>> cat /sys/kernel/mm/transparent_hugepage/defrag
> >> always [defer] [madvise] never
> >>
> >> I'm not sure about the analogous kernel boot option though, I guess
> >> those can't use spaces, so maybe comma-separated?
>
> No opinion on the above? I think it could be somewhat more elegant than
> a fifth-option that Mel said he would prefer, and deliver the same
> flexibility.

I am not sure we have considered the kcompactd watermark option
thoroughly either. The relation might not be clear because, I admit,
the proposal was scattered over several emails, so let me summarize it
here.

Let's add a system configuration which would control proactive
background compaction, which would
- wake up kcompactd proactively, based on a timeout, even when there
  is no immediate memory pressure
- keep compacting as long as the requested order is under the
  configured watermark and the compaction makes further progress.
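
To make the two rules above concrete, here is a minimal userspace sketch
of what one timeout-driven wakeup could do. Nothing like this is
implemented yet; every name below (zone_model, proactive_cfg,
compact_once, proactive_wakeup) is hypothetical and not existing kernel
API, and the "compaction" is just a counter model.

```c
#include <stdbool.h>

/* Hypothetical model of one zone; not a kernel structure. */
struct zone_model {
	unsigned int free_high_order;	/* free pages of the requested order */
	unsigned int compactable;	/* pages compaction could still form */
};

/* Hypothetical tunable: wanted number of free pages of that order. */
struct proactive_cfg {
	unsigned int wmark;
};

/* One modelled compaction pass; returns true if it made progress. */
static bool compact_once(struct zone_model *z)
{
	if (z->compactable == 0)
		return false;
	z->compactable--;
	z->free_high_order++;
	return true;
}

/*
 * Body of one timeout-driven wakeup: keep compacting while the zone is
 * under the configured watermark and the last pass made progress.
 * Returns the number of passes done.
 */
static unsigned int proactive_wakeup(struct zone_model *z,
				     const struct proactive_cfg *cfg)
{
	unsigned int passes = 0;

	while (z->free_high_order < cfg->wmark && compact_once(z))
		passes++;
	return passes;
}
```

The loop stops either because the watermark has been reached or because
a pass made no progress, so a fragmented zone with nothing left to
compact does not keep the thread spinning until the next timeout.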

The admin can set this tunable to reflect the demand for THP in the
particular workload. Now, how would it play with the THP specific defrag
options?
- never - THP allocations will be tried without any feedback to
  kcompactd - no stalls in the page fault path
- defer - THP allocations will be tried and kcompactd woken up
  outside of its wmark setting to catch up with the workload - no
  stalls in the page fault path
- madvise - do the direct compaction for madvised VMAs and rely
  on the kcompactd watermark setting to do the background
  compaction
- always - do the direct compaction for all VMAs
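
The mapping above can be written down as a small decision function. This
is only a sketch of the proposed semantics, not existing kernel code;
the enum, the struct and thp_policy() are hypothetical names.

```c
#include <stdbool.h>

/* The four existing defrag settings, as a hypothetical enum. */
enum thp_defrag {
	DEFRAG_NEVER,
	DEFRAG_DEFER,
	DEFRAG_MADVISE,
	DEFRAG_ALWAYS,
};

/* What a THP page fault may do under the proposal. */
struct thp_fault_policy {
	bool direct_compact;	/* may stall in the page fault path */
	bool kick_kcompactd;	/* wake kcompactd outside its wmark setting */
};

static struct thp_fault_policy thp_policy(enum thp_defrag mode,
					  bool vma_madvised)
{
	struct thp_fault_policy p = { false, false };

	switch (mode) {
	case DEFRAG_NEVER:
		/* no feedback to kcompactd, no stalls */
		break;
	case DEFRAG_DEFER:
		/* no stalls, but let kcompactd catch up with the workload */
		p.kick_kcompactd = true;
		break;
	case DEFRAG_MADVISE:
		/* stall only for madvised VMAs; others rely on the wmark */
		p.direct_compact = vma_madvised;
		break;
	case DEFRAG_ALWAYS:
		/* direct compaction for all VMAs */
		p.direct_compact = true;
		break;
	}
	return p;
}
```

Note that in this model the background watermark tunable is independent
of the mode: it keeps working for every setting, and the defrag option
only decides what happens in the fault path itself.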

We won't have to add a new THP specific option or modify an existing
one, and we will have a generic, user-independent tunable to tell the
system that it should try to generate high-order pages, which is
something there is demand for. Such a solution would be more flexible
as well because the configuration could reflect the demand much better.

Is there any reason, except for not being implemented yet, that would
make it inappropriate for the described usecase?
--
Michal Hocko
SUSE Labs