Re: [PATCH 44 of 66] skip transhuge pages in ksm for now

From: Andrea Arcangeli
Date: Thu Dec 09 2010 - 13:15:33 EST


On Thu, Nov 18, 2010 at 04:06:13PM +0000, Mel Gorman wrote:
> On Wed, Nov 03, 2010 at 04:28:19PM +0100, Andrea Arcangeli wrote:
> > From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> >
> > Skip transhuge pages in ksm for now.
> >
> > Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>
>
> Acked-by: Mel Gorman <mel@xxxxxxxxx>
>
> This is an idle concern that I haven't looked into but is there any conflict
> between khugepaged scanning and the KSM scanning?
>
> Specifically, I *think* the impact of this patch is that KSM will not
> accidentally split a huge page. Is that right? If so, it could do with
> being included in the changelog.

KSM wasn't aware of hugepages, so it'd never split them anyway. We
want KSM to split hugepages only when it finds two equal subpages.
That will happen later.

Right now there is no collision between ksmd and khugepaged: regular
pages, hugepages and ksm pages will co-exist fine in the same vma. The
only problem is that the system now has to start swapping before KSM
has a chance to find equal pages. We'll fix that in the future so KSM
can scan inside hugepages too, splitting them and merging the subpages
as needed before the memory pressure starts.

> On the other hand, can khugepaged be prevented from promoting a hugepage
> because of KSM?

Sure, khugepaged won't promote if there's any ksm page in the
range. That's not going to change. When KSM is started, the priority
remains saving memory. If people use enabled=madvise and
MADV_HUGEPAGE+MADV_MERGEABLE there is actually zero memory loss
because of THP and there is a speed improvement for all pages that
aren't equal. So it's an ideal setup even for embedded. A regular cloud
setup would be enabled=always + MADV_MERGEABLE (with enabled=always
MADV_HUGEPAGE becomes a noop).
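
For illustration, the enabled=madvise setup above could look roughly
like this from userspace. This is only a sketch: map_thp_ksm() is a
made-up helper name, and reporting the madvise() results as errno
values instead of failing is an assumption of the example, not what any
particular hypervisor does. The hints are treated as non-fatal because
a kernel built without THP or KSM rejects them with EINVAL.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map anonymous memory and mark it both as a
 * hugepage candidate (MADV_HUGEPAGE) and as mergeable by KSM
 * (MADV_MERGEABLE).
 *
 * Returns the mapping, or NULL if mmap() itself fails. The outcome of
 * each hint is stored in *thp_err and *ksm_err: 0 on success, else the
 * errno value reported by madvise().
 */
void *map_thp_ksm(size_t len, int *thp_err, int *ksm_err)
{
    assert(len > 0 && thp_err != NULL && ksm_err != NULL);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Let khugepaged and the page fault path use hugepages here. */
    *thp_err = madvise(p, len, MADV_HUGEPAGE) ? errno : 0;

    /* Let ksmd scan this range for equal pages. */
    *ksm_err = madvise(p, len, MADV_MERGEABLE) ? errno : 0;

    return p;
}
```

With enabled=always the MADV_HUGEPAGE call is redundant, so only the
MADV_MERGEABLE hint matters in that configuration.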

On a related note I'm also going to introduce a MADV_NO_HUGEPAGE, is
that a good name for it? Cloud management wants to be able to disable
THP on a per-VM basis (when the VMs are totally idle and low priority;
this currently also helps to maximize the power of KSM, which would
otherwise be activated only after initial swapping, but the KSM part
will be fixed). It could also be achieved with enabled=madvise and
MADV_HUGEPAGE, but we don't want to change the system-wide default in
order to disable THP on a per-VM basis: it's much nicer if the default
behavior of the host remains the same in case it's not a pure
hypervisor usage but there are other loads running in parallel to the
virt load. In theory a prctl(PR_NO_HUGEPAGE) could also do it and it'd
be possible to use from a wrapper (madvise can't be wrapped), but I
think MADV_NO_HUGEPAGE is cleaner and it won't require brand new
per-process info.
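
A sketch of what the per-VM opt-out might look like from the caller's
side. MADV_NO_HUGEPAGE is only the name proposed in this mail; the
example assumes the flag is exposed under the spelling MADV_NOHUGEPAGE,
which is what current Linux headers define. region_disable_thp() is a
hypothetical helper name used for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: ask the kernel not to back [addr, addr + len)
 * with transparent hugepages, leaving the system-wide enabled= setting
 * untouched. addr must be page aligned.
 *
 * Returns 0 on success or the errno value reported by madvise().
 */
int region_disable_thp(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    return madvise(addr, len, MADV_NOHUGEPAGE) ? errno : 0;
}
```

The management layer would call this on the guest memory of each VM
that should not get hugepages, while other loads on the host keep
whatever default the host was configured with.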