Re: [PATCH 28/28] mm, page_alloc: Defer debugging checks of pages allocated from the PCP

From: Vlastimil Babka
Date: Wed May 18 2016 - 03:52:06 EST


On 05/17/2016 08:41 AM, Naoya Horiguchi wrote:
>> @@ -2579,20 +2612,22 @@ struct page *buffered_rmqueue(struct zone *preferred_zone,
>>  		struct list_head *list;
>>
>>  		local_irq_save(flags);
>> -		pcp = &this_cpu_ptr(zone->pageset)->pcp;
>> -		list = &pcp->lists[migratetype];
>> -		if (list_empty(list)) {
>> -			pcp->count += rmqueue_bulk(zone, 0,
>> -					pcp->batch, list,
>> -					migratetype, cold);
>> -			if (unlikely(list_empty(list)))
>> -				goto failed;
>> -		}
>> +		do {
>> +			pcp = &this_cpu_ptr(zone->pageset)->pcp;
>> +			list = &pcp->lists[migratetype];
>> +			if (list_empty(list)) {
>> +				pcp->count += rmqueue_bulk(zone, 0,
>> +						pcp->batch, list,
>> +						migratetype, cold);
>> +				if (unlikely(list_empty(list)))
>> +					goto failed;
>> +			}
>>
>> -		if (cold)
>> -			page = list_last_entry(list, struct page, lru);
>> -		else
>> -			page = list_first_entry(list, struct page, lru);
>> +			if (cold)
>> +				page = list_last_entry(list, struct page, lru);
>> +			else
>> +				page = list_first_entry(list, struct page, lru);
>> +		} while (page && check_new_pcp(page));
>
> This causes an infinite loop when check_new_pcp() returns 1, because the bad
> page is still on the list (I assume that a bad page never disappears).
> The original kernel is free from this problem because it retries after
> list_del(). So would moving the following 3 lines into this do-while block
> solve the problem?
>
>     __dec_zone_state(zone, NR_ALLOC_BATCH);
>     list_del(&page->lru);
>     pcp->count--;
>
> There seems to be no infinite loop issue in the order > 0 block below, because
> bad pages are deleted from the free list in __rmqueue_smallest().

Oops, thanks for catching this, I wish it had been caught sooner...
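
To spell out the issue outside the kernel, here is a minimal userspace sketch of the retry loop with the unlink moved inside it, as suggested above. It is only an illustration under assumptions, not the actual fix: toy_page, toy_pcp, toy_check_new() and toy_take_page() are made-up names, a plain singly linked list stands in for the pcp list, and the refill via rmqueue_bulk(), the cold/hot choice and __dec_zone_state() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page sitting on a per-cpu free list. */
struct toy_page {
	struct toy_page *next;
	bool bad;		/* true if the debugging check would reject it */
};

/* Hypothetical stand-in for struct per_cpu_pages (one list only). */
struct toy_pcp {
	struct toy_page *head;
	int count;		/* number of pages currently on the list */
};

/* Stand-in for check_new_pcp(): nonzero means the page must not be used. */
static int toy_check_new(const struct toy_page *page)
{
	return page->bad;
}

/*
 * Take the first usable page. The page is unlinked *before* it is
 * checked, so a rejected page is gone from the list and the next
 * iteration looks at a different page. Checking first and unlinking
 * after the loop would pick the same bad page forever.
 */
static struct toy_page *toy_take_page(struct toy_pcp *pcp)
{
	struct toy_page *page;

	do {
		page = pcp->head;
		if (!page)
			return NULL;		/* list exhausted */
		assert(pcp->count > 0);
		pcp->head = page->next;		/* analogue of list_del(&page->lru) */
		page->next = NULL;
		pcp->count--;			/* analogue of pcp->count-- */
	} while (toy_check_new(page));

	return page;
}
```

With two bad pages queued ahead of a good one, toy_take_page() drops both bad pages and returns the good one, leaving the list empty.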

----8<----