Re: [PATCH v2] mm: cma: indefinitely retry allocations in cma_alloc

From: Chris Goldsworthy
Date: Thu Sep 24 2020 - 01:13:24 EST


On 2020-09-17 10:54, Chris Goldsworthy wrote:
On 2020-09-15 00:53, David Hildenbrand wrote:
On 14.09.20 20:33, Chris Goldsworthy wrote:
On 2020-09-14 02:31, David Hildenbrand wrote:
On 11.09.20 21:17, Chris Goldsworthy wrote:

So, inside of cma_alloc(), instead of giving up when
alloc_contig_range()
returns -EBUSY after having scanned a whole CMA-region bitmap,
perform
retries indefinitely, with sleeps, to give the system an opportunity
to
unpin any pinned pages.

Signed-off-by: Chris Goldsworthy <cgoldswo@xxxxxxxxxxxxxx>
Co-developed-by: Vinayak Menon <vinmenon@xxxxxxxxxxxxxx>
Signed-off-by: Vinayak Menon <vinmenon@xxxxxxxxxxxxxx>
---
mm/cma.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 7f415d7..90bb505 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -442,8 +443,28 @@ struct page *cma_alloc(struct cma *cma, size_t
count, unsigned int align,
bitmap_maxno, start, bitmap_count, mask,
offset);
if (bitmap_no >= bitmap_maxno) {
- mutex_unlock(&cma->lock);
- break;
+ if (ret == -EBUSY) {
+ mutex_unlock(&cma->lock);
+
+ /*
+ * Page may be momentarily pinned by some other
+ * process which has been scheduled out, e.g.
+ * in exit path, during unmap call, or process
+ * fork and so cannot be freed there. Sleep
+ * for 100ms and retry the allocation.
+ */
+ start = 0;
+ ret = -ENOMEM;
+ msleep(100);
+ continue;
+ } else {
+ /*
+ * ret == -ENOMEM - all bits in cma->bitmap are
+ * set, so we break accordingly.
+ */
+ mutex_unlock(&cma->lock);
+ break;
+ }
}
bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
/*


What about long-term pinnings? IIRC, that can happen easily e.g.,
with
vfio (and I remember there is a way via vmsplice).

Not convinced trying forever is a sane approach in the general case
...

V1:
[1] https://lkml.org/lkml/2020/8/5/1097
[2] https://lkml.org/lkml/2020/8/6/1040
[3] https://lkml.org/lkml/2020/8/11/893
[4] https://lkml.org/lkml/2020/8/21/1490
[5] https://lkml.org/lkml/2020/9/11/1072

We're fine with doing indefinite retries, on the grounds that if there
is some long-term pinning that occurs when alloc_contig_range returns
-EBUSY, that it should be debugged and fixed. Would it be possible to
make this infinite-retrying something that could be enabled or
disabled
by a defconfig option?

Two thoughts:

This means I strongly prefer something like [3] if feasible.

_Resending so that this ends up on LKML_

I can give [3] some further thought then. Also, I realized [3] will not
completely solve the problem, it just reduces the window in which
_refcount > _mapcount (as mentioned in earlier threads, we encountered
the pinning when a task in copy_one_pte() or in the exit_mmap() path
gets context switched out). If we were to try a sleeping-lock based
solution, do you think it would be permissible to add another lock to
struct page?

I have not been able to think of a clean way of introducing calls to preempt_disable() in exit_mmap(), which is the more problematic case. We would need to track state across multiple invocations of zap_pte_range() (which is called for each entry in a PMD when a process's memory is being unmapped), and would also need to extend this to tlb_finish_mmu(), which is called after all the process's memory has been unmapped: https://elixir.bootlin.com/linux/v5.8.10/source/mm/mmap.c#L3164. As a follow-up to this patch, I'm submitting a patch that re-introduces the GFP mask for cma_alloc, that will perform indefinite retires if __GFP_NOFAIL is passed to the function.

--
The Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project