Re: [RFC PATCH 3/3] mm: support active anti-fragmentation algorithm

From: Joonsoo Kim
Date: Tue May 19 2015 - 04:04:06 EST


On Tue, May 12, 2015 at 11:01:48AM +0200, Vlastimil Babka wrote:
> On 04/28/2015 09:45 AM, Joonsoo Kim wrote:
> >On Mon, Apr 27, 2015 at 09:29:23AM +0100, Mel Gorman wrote:
> >>On Mon, Apr 27, 2015 at 04:23:41PM +0900, Joonsoo Kim wrote:
> >>>We already have an anti-fragmentation policy in the page allocator. It works
> >>>well when system memory is sufficient, but it doesn't work well when system
> >>>memory isn't sufficient, because memory is already highly fragmented and the
> >>>fallback/steal mechanism cannot grab a whole pageblock. If there is a heavy
> >>>unmovable allocation requestor like zram, the problem gets worse.
> >>>
> >>>CPU: 8
> >>>RAM: 512 MB with zram swap
> >>>WORKLOAD: kernel build with -j12
> >>>OPTION: page owner is enabled to measure fragmentation
> >>>After finishing the build, check fragmentation by 'cat /proc/pagetypeinfo'
> >>>
> >>>* Before
> >>>Number of blocks type (movable)
> >>>DMA32: 207
> >>>
> >>>Number of mixed blocks (movable)
> >>>DMA32: 111.2
> >>>
> >>>A mixed block is a movable pageblock that contains one or more pages
> >>>allocated for unmovable/reclaimable allocations. The result shows that more
> >>>than half of the movable pageblocks are tainted by allocations of other
> >>>migratetypes.
> >>>
> >>>To mitigate this fragmentation, this patch implements an active
> >>>anti-fragmentation algorithm. The idea is really simple. When an
> >>>unmovable/reclaimable steal happens from a movable pageblock, we try to
> >>>migrate out the other pages in this pageblock that are still migratable and
> >>>use the resulting free pages for further allocation requests of the
> >>>corresponding migratetype.
> >>>
> >>>Once an unmovable allocation taints a movable pageblock, it cannot easily
> >>>recover. Instead of praying that it gets restored, turning it into an
> >>>unmovable pageblock as completely as possible and using it for further
> >>>unmovable requests is a more reasonable approach.
> >>>
> >>>Below is the result of this idea.
> >>>
> >>>* After
> >>>Number of blocks type (movable)
> >>>DMA32: 208.2
> >>>
> >>>Number of mixed blocks (movable)
> >>>DMA32: 55.8
> >>>
> >>>The result shows that the number of non-mixed movable blocks increases by
> >>>59% in this case (from 207 - 111.2 = 95.8 to 208.2 - 55.8 = 152.4).
>
> Interesting. I tested a patch prototype like this too (although the
> work wasn't offloaded to a kthread, I wanted to see benefits first)
> and it yielded no significant difference. But admittedly I was using
> stress-highalloc for huge page sized allocations and a 4GB memory
> system...

Okay.
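For reference, the core of the steal-time clean-up described in the changelog
above is roughly the following. This is a simplified, hypothetical sketch
rather than the patch itself; antifrag_clean_pageblock() is an invented name
and alloc_migrate_target() is just one possible migration target:

#include <linux/mm.h>
#include <linux/swap.h>
#include <linux/migrate.h>
#include <linux/page-isolation.h>

/*
 * Hypothetical sketch only: after an unmovable steal from a movable
 * pageblock, walk the rest of that pageblock, isolate the remaining
 * movable (LRU) pages and migrate them out, so the whole block can
 * serve further unmovable requests.
 */
static void antifrag_clean_pageblock(struct page *page)
{
	unsigned long start_pfn = page_to_pfn(page) & ~(pageblock_nr_pages - 1);
	unsigned long end_pfn = start_pfn + pageblock_nr_pages;
	unsigned long pfn;
	LIST_HEAD(migrate_list);

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		struct page *p;

		if (!pfn_valid_within(pfn))
			continue;

		p = pfn_to_page(pfn);
		if (!get_page_unless_zero(p))
			continue;

		/* only LRU pages can be migrated out of the pageblock */
		if (PageLRU(p) && isolate_lru_page(p) == 0)
			list_add_tail(&p->lru, &migrate_list);

		/* drop the temporary reference; LRU isolation holds its own */
		put_page(p);
	}

	if (!list_empty(&migrate_list)) {
		if (migrate_pages(&migrate_list, alloc_migrate_target, NULL,
				  0, MIGRATE_SYNC_LIGHT, MR_COMPACTION))
			putback_movable_pages(&migrate_list);
	}
}

In the patch this work is triggered from the fallback/steal path and offloaded
rather than being done synchronously, as discussed below.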

>
> So with these results it seems definitely worth pursuing, taking
> Mel's comments into account. We should think about coordination with
> khugepaged, which is another source of compaction. See my patchset
> from yesterday "Outsourcing page fault THP allocations to
> khugepaged" (sorry I didn't CC you). I think ideally this "antifrag"

I will check it.

> or maybe "kcompactd" thread would be one per NUMA node and serve
> both for the pageblock antifragmentation requests (with higher

Before, I tried an approach that creates one kantifragd thread per node.
Sometimes anti-fragmentation requests flood into that thread faster than
it can handle them in time. By using a workqueue, I can spread the work
across all CPUs, so this problem is reduced. But how much time we spend
on anti-fragmentation work is really a policy decision, so one thread
per node could also be enough.
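
Roughly, the workqueue variant looks like the sketch below. The names
antifrag_work and queue_antifrag_work() are invented for illustration; in
the real patch the work would be queued from the steal path in the page
allocator:

#include <linux/workqueue.h>
#include <linux/slab.h>
#include <linux/mm.h>

/* Hypothetical sketch: one small work item per stolen pageblock. */
struct antifrag_work {
	struct work_struct work;
	unsigned long pb_start_pfn;	/* first pfn of the stolen pageblock */
	int migratetype;		/* migratetype that did the stealing */
};

static void antifrag_work_fn(struct work_struct *work)
{
	struct antifrag_work *aw = container_of(work, struct antifrag_work, work);

	/* migrate movable pages out of the pageblock (see earlier sketch) */
	antifrag_clean_pageblock(pfn_to_page(aw->pb_start_pfn));
	kfree(aw);
}

/*
 * Called from the fallback/steal path.  GFP_ATOMIC because the caller
 * may hold zone->lock; system_unbound_wq lets the work run on any CPU,
 * which is how the load gets spread instead of piling up on one kthread.
 */
static void queue_antifrag_work(unsigned long pb_start_pfn, int migratetype)
{
	struct antifrag_work *aw = kzalloc(sizeof(*aw), GFP_ATOMIC);

	if (!aw)
		return;

	aw->pb_start_pfn = pb_start_pfn;
	aw->migratetype = migratetype;
	INIT_WORK(&aw->work, antifrag_work_fn);
	queue_work(system_unbound_wq, &aw->work);
}

A per-node kantifragd would instead keep a list of stolen pageblocks and
process it in a loop, which is simpler but serializes the work on one CPU
per node.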

Thanks.