Re: [PATCH v3] mm: cma: indefinitely retry allocations in cma_alloc

From: Chris Goldsworthy
Date: Mon Sep 28 2020 - 16:11:57 EST


On 2020-09-27 12:23, Minchan Kim wrote:
On Wed, Sep 23, 2020 at 10:16:25PM -0700, Chris Goldsworthy wrote:
CMA allocations will fail if 'pinned' pages are in a CMA area, since we



+config CMA_RETRY_SLEEP_DURATION
+ int "Sleep duration between retries"
+ depends on CMA
+ default 100
+ help
+ Time to sleep for in milliseconds between the indefinite
+ retries of a CMA allocation that are induced by passing
+ __GFP_NOFAIL to cma_alloc().
+
+ If unsure, leave the value as "100".

What's the point of this new config? If we need it, How could admin
set their value?
How does it make bad if we just use simple default vaule?
IOW, I'd like to avoid introducing new config if we don't see good
justifcation.

I thought that it would be useful for developers, but I guess it would be much better for this to be runtime configurable. But, I don't think there's a strong reason to be able to configure the value - 100 ms has worked for us. I'll update scrap this option in a follow-up patch, and will use 100 ms as the sleeping time.

+
config MEM_SOFT_DIRTY
bool "Track memory changes"
depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
diff --git a/mm/cma.c b/mm/cma.c
index 7f415d7..4fbad2b 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -32,6 +32,7 @@
#include <linux/highmem.h>
#include <linux/io.h>
#include <linux/kmemleak.h>
+#include <linux/delay.h>
#include <trace/events/cma.h>

#include "cma.h"
@@ -403,13 +404,15 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
* @cma: Contiguous memory region for which the allocation is performed.
* @count: Requested number of pages.
* @align: Requested alignment of pages (in PAGE_SIZE order).
- * @no_warn: Avoid printing message about failed allocation
+ * @gfp_mask: If __GFP_NOWARN is passed, suppress messages about failed
+ * allocations. If __GFP_NOFAIL is passed, try doing the CMA
+ * allocation indefinitely until the allocation succeeds.
*
* This function allocates part of contiguous memory on specific
* contiguous memory area.
*/
struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
- bool no_warn)
+ gfp_t gfp_mask)
{
unsigned long mask, offset;
unsigned long pfn = -1;
@@ -442,8 +445,28 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
bitmap_maxno, start, bitmap_count, mask,
offset);
if (bitmap_no >= bitmap_maxno) {
- mutex_unlock(&cma->lock);
- break;
+ if (ret == -EBUSY && gfp_mask & __GFP_NOFAIL) {
+ mutex_unlock(&cma->lock);
+
+ /*
+ * Page may be momentarily pinned by some other
+ * process which has been scheduled out, e.g.
+ * in exit path, during unmap call, or process
+ * fork and so cannot be freed there. Sleep
+ * for 100ms and retry the allocation.
+ */
+ start = 0;
+ ret = -ENOMEM;
+ msleep(CONFIG_CMA_RETRY_SLEEP_DURATION);

Should it be uninterruptible, really?

Good point - I'll replace the msleep() this with schedule_timeout_killable() in the follow-up patch.

--
The Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project