On Fri, 14 Jun 2013, Christoph Lameter wrote:
> > It's possible to avoid such problems (or at least to make them less probable)
> > by avoiding direct compaction. If it's not possible to allocate a contiguous
> > page without compaction, slub will fall back to order 0 page(s). In this case
> > kswapd will be woken to perform asynchronous compaction. So, slub can return
> > to default order allocations as soon as memory will be de-fragmented.
>
> Sounds like a good idea. Do you have some numbers to show the effect of
> this patch?
I'm surprised you like this patch; it basically makes slub allocations
atomic and tries neither memory compaction nor reclaim. Asynchronous
compaction certainly isn't aggressive enough to mimic the effects of the
old lumpy reclaim that would have resulted in less fragmented memory. If
slub is the only thing that is doing high-order allocations, it will start
falling back to the smallest page order much, much more often.
I agree that this doesn't seem like a slub issue at all but rather a page
allocator issue; if we have many simultaneous thp faults and
/sys/kernel/mm/transparent_hugepage/defrag is "always", then you'll get
the same problem if deferred compaction isn't helping.
So I don't think we should be patching slub in any special way here.
Roman, are you using the latest kernel? If so, what does

	grep compact_ /proc/vmstat

show after one or more of these events?
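
If it's easier than eyeballing two grep outputs, something like the following could be used to compare the compact_ counters from before and after an event. This is only a rough sketch: the helper names parse_compact_counters and counter_deltas are made up for illustration, and it assumes nothing more than the usual "name value" line layout of /proc/vmstat.

```python
def parse_compact_counters(vmstat_text):
    """Return {name: value} for every compact_* line in /proc/vmstat text.

    Assumes each line has the form "<name> <integer>"; other lines are skipped.
    """
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("compact_"):
            counters[parts[0]] = int(parts[1])
    return counters


def counter_deltas(before, after):
    """Return how much each counter in `after` grew relative to `before`.

    A counter missing from `before` is treated as having started at 0.
    """
    return {name: value - before.get(name, 0) for name, value in after.items()}
```

To use it, save the contents of /proc/vmstat to a file before and after one of the events, pass each file's text to parse_compact_counters(), and hand the two results to counter_deltas(). Which counters matter most depends on the kernel version, so I've left that interpretation out.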