Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
From: Mel Gorman
Date: Mon Feb 02 2015 - 17:18:33 EST
On Mon, Feb 02, 2015 at 02:05:06PM -0800, Andrew Morton wrote:
> On Mon, 2 Feb 2015 16:55:25 +0000 Mel Gorman <mgorman@xxxxxxx> wrote:
>
> > glibc malloc changed behaviour in glibc 2.10 to have per-thread arenas
> > instead of creating new arenas if the existing ones were contended.
> > The decision appears to have been made so the allocator scales better but the
> > downside is that madvise(MADV_DONTNEED) is now called for these per-thread
> > arenas during free. This tears down pages that would have previously
> > remained. There is nothing wrong with this decision from a functional point
> > of view but any threaded application that frequently allocates/frees the
> > same-sized region is going to incur the full teardown and refault costs.
>
> MADV_DONTNEED has been there for many years. How could this problem
> not have been noticed during glibc 2.10 development/testing?
I do not know. I only spotted it due to switching distributions. Looping
allocations and frees of the same sizes is considered inefficient and it
might have been dismissed on those grounds. It's probably less noticeable
when it only affects threaded applications.
> Is there
> some more recent kernel change which is triggering this?
>
Not that I'm aware of.
> > This patch identifies when a thread is frequently calling MADV_DONTNEED
> > on the same region of memory and starts ignoring the hint.
>
> That's pretty nasty-looking :(
>
Yep, it is, but we're very limited in terms of what we can do within the
kernel here.
> And presumably there are all sorts of behaviours which will still
> trigger the problem but which will avoid the start/end equality test in
> ignore_madvise_hint()?
>
Yes. I would expect that a simple pattern of multiple allocs followed by
multiple frees in a loop would also trigger it.
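
To illustrate why such a pattern slips through, here is a minimal userspace
sketch of a start/end equality test. It is not the code from the patch:
struct hint_tracker, would_ignore(), count_ignored() and REPEAT_LIMIT are
hypothetical names and values invented for this example. It only assumes
that a hint is ignored once the same (start, end) pair has been seen several
times in a row.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread state; NOT the structure used by the patch. */
struct hint_tracker {
	uintptr_t last_start;
	uintptr_t last_end;
	unsigned int repeats;	/* consecutive hints on the same range */
};

#define REPEAT_LIMIT 4u		/* arbitrary threshold for illustration */

/* Returns true when this sketch would ignore the hint. */
bool would_ignore(struct hint_tracker *t, uintptr_t start, uintptr_t end)
{
	if (start == t->last_start && end == t->last_end) {
		if (t->repeats < REPEAT_LIMIT)
			t->repeats++;
		return t->repeats >= REPEAT_LIMIT;
	}

	/* A different range resets the tracking entirely. */
	t->last_start = start;
	t->last_end = end;
	t->repeats = 0;
	return false;
}

/* Cycle over nranges distinct ranges and count the ignored hints. */
unsigned int count_ignored(unsigned int nranges, unsigned int iterations)
{
	struct hint_tracker t = { 0, 0, 0 };
	unsigned int ignored = 0;

	assert(nranges > 0);
	for (unsigned int i = 0; i < iterations; i++) {
		uintptr_t start = 0x100000u + (uintptr_t)(i % nranges) * 0x10000u;

		if (would_ignore(&t, start, start + 0x10000u))
			ignored++;
	}
	return ignored;
}
```

With a single range, count_ignored(1, 100) reports that most hints are
ignored once the threshold is reached. With two alternating ranges,
count_ignored(2, 100) returns 0, because consecutive hints never match and
every call still pays the full teardown cost.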
> Really, this is a glibc problem and only a glibc problem.
> MADV_DONTNEED is unavoidably expensive and glibc is calling
> MADV_DONTNEED for a region which it *does* need.
To be fair to glibc, it calls it on a region it *thinks* it doesn't need only
to reuse it immediately afterwards because of how the benchmark is
implemented.
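
For reference, a minimal sketch of what the free-then-reuse cycle costs at
the page level. It assumes Linux and a private anonymous mapping, which is
an assumption about the kind of memory involved; byte_after_dontneed() is a
hypothetical helper name, and this is not the benchmark discussed here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Dirty a private anonymous region, drop it with MADV_DONTNEED, then touch
 * it again. Returns the first byte observed after the madvise() call, or -1
 * on error.
 */
int byte_after_dontneed(size_t len)
{
	unsigned char *p;
	int first;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xAA, len);		/* fault in and dirty every page */

	if (madvise(p, len, MADV_DONTNEED) != 0) {
		munmap(p, len);
		return -1;
	}

	first = p[0];			/* page is gone, so this refaults */
	munmap(p, len);
	return first;
}
```

On Linux the old contents are discarded and the next access faults in a
zero-filled page, so the function returns 0 rather than 0xAA. A caller that
reuses the region immediately pays for that teardown and refault on every
iteration.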
> Is there something
> preventing this from being addressed within glibc?
I doubt it, other than that I expect they'll punt it back and blame either
the application for being stupid or the kernel for being slow.
--
Mel Gorman
SUSE Labs