Re: [PATCH v2 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used
From: Gregory Fong
Date: Mon Jan 05 2015 - 23:02:21 EST
+linux-mm and linux-kernel (not sure how those got removed from cc,
sorry about that)
On Mon, Jan 5, 2015 at 7:58 PM, Gregory Fong <gregory.0xf0@xxxxxxxxx> wrote:
> Hi Joonsoo,
>
> On Wed, May 28, 2014 at 12:04 AM, Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> wrote:
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 674ade7..ca678b6 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -788,6 +788,56 @@ void __init __free_pages_bootmem(struct page *page, unsigned int order)
>> }
>>
>> #ifdef CONFIG_CMA
>> +void adjust_managed_cma_page_count(struct zone *zone, long count)
>> +{
>> + unsigned long flags;
>> + long total, cma, movable;
>> +
>> + spin_lock_irqsave(&zone->lock, flags);
>> + zone->managed_cma_pages += count;
>> +
>> + total = zone->managed_pages;
>> + cma = zone->managed_cma_pages;
>> + movable = total - cma - high_wmark_pages(zone);
>> +
>> + /* No cma pages, so do only movable allocation */
>> + if (cma <= 0) {
>> + zone->max_try_movable = pageblock_nr_pages;
>> + zone->max_try_cma = 0;
>> + goto out;
>> + }
>> +
>> + /*
>> + * We want to consume cma pages with well balanced ratio so that
>> + * we have consumed enough cma pages before the reclaim. For this
>> + * purpose, we can use the ratio, movable : cma. And we doesn't
>> + * want to switch too frequently, because it prevent allocated pages
>> + * from beging successive and it is bad for some sorts of devices.
>> + * I choose pageblock_nr_pages for the minimum amount of successive
>> + * allocation because it is the size of a huge page and fragmentation
>> + * avoidance is implemented based on this size.
>> + *
>> + * To meet above criteria, I derive following equation.
>> + *
>> + * if (movable > cma) then; movable : cma = X : pageblock_nr_pages
>> + * else (movable <= cma) then; movable : cma = pageblock_nr_pages : X
>> + */
>> + if (movable > cma) {
>> + zone->max_try_movable =
>> + (movable * pageblock_nr_pages) / cma;
>> + zone->max_try_cma = pageblock_nr_pages;
>> + } else {
>> + zone->max_try_movable = pageblock_nr_pages;
>> + zone->max_try_cma = cma * pageblock_nr_pages / movable;
>
> I don't know if anyone has already pointed this out (I didn't see
> anything when searching lkml), but while testing this I noticed that it
> can result in a division by zero under memory pressure, when movable
> becomes 0. That is not unlikely when the majority of pages are in CMA
> regions (this may seem pathological, but we actually do this right now).
>
> [ 0.249674] Division by zero in kernel.
> [ 0.249682] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.13-1.3pre-00368-g4d90957-dirty #10
> [ 0.249710] [<c001619c>] (unwind_backtrace) from [<c0011fa4>] (show_stack+0x10/0x14)
> [ 0.249725] [<c0011fa4>] (show_stack) from [<c0538d6c>] (dump_stack+0x80/0x90)
> [ 0.249740] [<c0538d6c>] (dump_stack) from [<c025e9d0>] (Ldiv0+0x8/0x10)
> [ 0.249751] [<c025e9d0>] (Ldiv0) from [<c0094ba4>] (adjust_managed_cma_page_count+0x64/0xd8)
> [ 0.249762] [<c0094ba4>] (adjust_managed_cma_page_count) from [<c00cb2f4>] (cma_release+0xa8/0xe0)
> [ 0.249776] [<c00cb2f4>] (cma_release) from [<c0721698>] (cma_drvr_probe+0x378/0x470)
> [ 0.249787] [<c0721698>] (cma_drvr_probe) from [<c02ce9cc>] (platform_drv_probe+0x18/0x48)
> [ 0.249799] [<c02ce9cc>] (platform_drv_probe) from [<c02ccfb0>] (driver_probe_device+0xac/0x3a4)
> [ 0.249808] [<c02ccfb0>] (driver_probe_device) from [<c02cd378>] (__driver_attach+0x8c/0x90)
> [ 0.249817] [<c02cd378>] (__driver_attach) from [<c02cb390>] (bus_for_each_dev+0x60/0x94)
> [ 0.249825] [<c02cb390>] (bus_for_each_dev) from [<c02cc674>] (bus_add_driver+0x15c/0x218)
> [ 0.249834] [<c02cc674>] (bus_add_driver) from [<c02cd9a0>] (driver_register+0x78/0xf8)
> [ 0.249841] [<c02cd9a0>] (driver_register) from [<c02cea24>] (platform_driver_probe+0x20/0xa4)
> [ 0.249849] [<c02cea24>] (platform_driver_probe) from [<c0008958>] (do_one_initcall+0xd4/0x17c)
> [ 0.249857] [<c0008958>] (do_one_initcall) from [<c0719d00>] (kernel_init_freeable+0x13c/0x1dc)
> [ 0.249864] [<c0719d00>] (kernel_init_freeable) from [<c0534578>] (kernel_init+0x8/0xe8)
> [ 0.249873] [<c0534578>] (kernel_init) from [<c000ed78>] (ret_from_fork+0x14/0x3c)
>
> We could probably just add a check above this, similar to the
> "no cma pages" case, like
>
> 	/* No movable pages, so only do CMA allocation */
> 	if (movable <= 0) {
> 		zone->max_try_cma = pageblock_nr_pages;
> 		zone->max_try_movable = 0;
> 		goto out;
> 	}
>
>> + }
>> +
>> +out:
>> + zone->nr_try_movable = zone->max_try_movable;
>> + zone->nr_try_cma = zone->max_try_cma;
>> +
>> + spin_unlock_irqrestore(&zone->lock, flags);
>> +}
>> +
>
> Best regards,
> Gregory
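
For anyone who wants to poke at the arithmetic outside the kernel, here
is a rough userspace sketch of the ratio computation with the suggested
guard folded in. It is only a model under my own assumptions: struct
try_limits and compute_try_limits() are made-up names for illustration
and are not part of the patch, and the real function operates on
struct zone under zone->lock rather than on plain arguments.

```c
#include <assert.h>

/* Hypothetical stand-in for the max_try_* fields the patch adds to struct zone. */
struct try_limits {
	long max_try_movable;
	long max_try_cma;
};

/*
 * Userspace model of the ratio computation in
 * adjust_managed_cma_page_count(), plus the proposed
 * "no movable pages" guard. All arguments are page counts.
 */
static struct try_limits compute_try_limits(long total, long cma,
					    long high_wmark, long pageblock)
{
	struct try_limits lim;
	long movable = total - cma - high_wmark;

	assert(pageblock > 0);

	if (cma <= 0) {
		/* No cma pages, so do only movable allocation */
		lim.max_try_movable = pageblock;
		lim.max_try_cma = 0;
	} else if (movable <= 0) {
		/* No movable pages, so only do CMA allocation */
		lim.max_try_movable = 0;
		lim.max_try_cma = pageblock;
	} else if (movable > cma) {
		/* movable : cma = X : pageblock */
		lim.max_try_movable = movable * pageblock / cma;
		lim.max_try_cma = pageblock;
	} else {
		/* movable : cma = pageblock : X; movable > 0 here */
		lim.max_try_movable = pageblock;
		lim.max_try_cma = cma * pageblock / movable;
	}
	return lim;
}
```

Because the guard tests movable <= 0 rather than == 0, it also covers
the case where cma plus the high watermark exceeds the managed page
count and movable goes negative, so neither division can see a zero or
negative divisor.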