Re: [PATCH v2] mm: cma: indefinitely retry allocations in cma_alloc

From: Randy Dunlap
Date: Fri Sep 11 2020 - 17:42:29 EST


On 9/11/20 2:37 PM, Florian Fainelli wrote:
>
>
> On 9/11/2020 1:54 PM, Chris Goldsworthy wrote:
>> CMA allocations will fail if 'pinned' pages are in a CMA area, since we
>> cannot migrate pinned pages. The _refcount of a struct page being greater
>> than _mapcount for that page can cause pinning for anonymous pages.  This
>> is because try_to_unmap(), which (1) is called in the CMA allocation path,
>> and (2) decrements both _refcount and _mapcount for a page, will stop
>> unmapping a page from VMAs once the _mapcount for a page reaches 0.  This
>> implies that after try_to_unmap() has finished successfully for a page
>> where _recount > _mapcount, that _refcount will be greater than 0.  Later
>> in the CMA allocation path in migrate_page_move_mapping(), we will have one
>> more reference count than intended for anonymous pages, meaning the
>> allocation will fail for that page.
>>
>> One example of where _refcount can be greater than _mapcount for a page we
>> would not expect to be pinned is inside of copy_one_pte(), which is called
>> during a fork. For ptes for which pte_present(pte) == true, copy_one_pte()
>> will increment the _refcount field followed by the  _mapcount field of a
>> page. If the process doing copy_one_pte() is context switched out after
>> incrementing _refcount but before incrementing _mapcount, then the page
>> will be temporarily pinned.
>>
>> So, inside of cma_alloc(), instead of giving up when alloc_contig_range()
>> returns -EBUSY after having scanned a whole CMA-region bitmap, perform
>> retries indefinitely, with sleeps, to give the system an opportunity to
>> unpin any pinned pages.
>
> I am by no means an authoritative CMA person but this behavior does not seem acceptable, there is no doubt the existing one is sub-optimal under specific circumstances, but an indefinite retry, as well as a 100ms sleep appear to be arbitrary at best. How about you introduce a parameter that allows the tuning of the number of retries and/or delay between retries?

Also:

You should send your patch to linux-mm@xxxxxxxxx -- that's where
most memory management type patches are reviewed.

You should also send your patch to someone who could actually merge
it into the kernel source tree -- assuming that's what you want to happen.

Try scripts/get_mainttainer.pl on your patch to see what it says.

And if you are going to use a "cover letter" or "introductory email" before
the actual patch, the second (patch(es)) should be sent chained to the
first email. git send-email should do this for you.

--
~Randy