On 2020-09-15 00:53, David Hildenbrand wrote:
On 14.09.20 20:33, Chris Goldsworthy wrote:
On 2020-09-14 02:31, David Hildenbrand wrote:
On 11.09.20 21:17, Chris Goldsworthy wrote:
So, inside of cma_alloc(), instead of giving up when alloc_contig_range() returns -EBUSY after having scanned a whole CMA-region bitmap, perform retries indefinitely, with sleeps, to give the system an opportunity to unpin any pinned pages.
Signed-off-by: Chris Goldsworthy <cgoldswo@xxxxxxxxxxxxxx>
Co-developed-by: Vinayak Menon <vinmenon@xxxxxxxxxxxxxx>
Signed-off-by: Vinayak Menon <vinmenon@xxxxxxxxxxxxxx>
---
mm/cma.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/mm/cma.c b/mm/cma.c
index 7f415d7..90bb505 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -442,8 +443,28 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 				bitmap_maxno, start, bitmap_count, mask,
 				offset);
 		if (bitmap_no >= bitmap_maxno) {
-			mutex_unlock(&cma->lock);
-			break;
+			if (ret == -EBUSY) {
+				mutex_unlock(&cma->lock);
+
+				/*
+				 * Page may be momentarily pinned by some other
+				 * process which has been scheduled out, e.g.
+				 * in exit path, during unmap call, or process
+				 * fork and so cannot be freed there. Sleep
+				 * for 100ms and retry the allocation.
+				 */
+				start = 0;
+				ret = -ENOMEM;
+				msleep(100);
+				continue;
+			} else {
+				/*
+				 * ret == -ENOMEM - all bits in cma->bitmap are
+				 * set, so we break accordingly.
+				 */
+				mutex_unlock(&cma->lock);
+				break;
+			}
 		}
 		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
 		/*
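[For context, a minimal sketch of what the change means for a caller. The wrapper names here are hypothetical; cma_alloc()/cma_release() are the real mm/cma.c entry points with their 5.9-era signatures:]

#include <linux/cma.h>

/*
 * Hypothetical driver-side caller: with this patch, a transiently
 * pinned region no longer makes cma_alloc() fail -- it retries
 * internally, sleeping 100ms per pass, until the pages can be
 * migrated. NULL is then essentially only returned on the -ENOMEM
 * path (bitmap genuinely full).
 */
static struct page *example_alloc(struct cma *cma, size_t nr_pages)
{
	/* align = 0: no alignment beyond page granularity requested */
	return cma_alloc(cma, nr_pages, 0, false);
}

static void example_free(struct cma *cma, struct page *pages,
			 size_t nr_pages)
{
	cma_release(cma, pages, nr_pages);
}

[Note that cma_alloc() can now block indefinitely, so callers that cannot tolerate unbounded latency would need their own timeout on top.]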
What about long-term pinnings? IIRC, that can happen easily e.g. with vfio (and I remember there is a way via vmsplice).

Not convinced trying forever is a sane approach in the general case ...
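[To make the vmsplice case concrete, a minimal, illustrative userspace sketch of such a long-term pin; 4096 stands in for the page size. Pages vmspliced into a pipe stay referenced until the data is read back out, so an unread pipe pins them indefinitely and the retry loop above never terminates:]

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
	int pipefd[2];
	struct iovec iov;
	char *buf;

	if (pipe(pipefd) || posix_memalign((void **)&buf, 4096, 4096))
		return 1;
	buf[0] = 1;			/* fault the page in */

	iov.iov_base = buf;
	iov.iov_len = 4096;

	/*
	 * The pipe now holds a reference on buf's page until someone
	 * reads from pipefd[0]. If that page sits in a CMA region and
	 * is never read out, migration keeps failing with -EBUSY and
	 * cma_alloc() retries forever.
	 */
	if (vmsplice(pipefd[1], &iov, 1, 0) < 0)
		return 1;

	pause();			/* keep the pin alive */
	return 0;
}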
V1:
[1] https://lkml.org/lkml/2020/8/5/1097
[2] https://lkml.org/lkml/2020/8/6/1040
[3] https://lkml.org/lkml/2020/8/11/893
[4] https://lkml.org/lkml/2020/8/21/1490
[5] https://lkml.org/lkml/2020/9/11/1072
We're fine with doing indefinite retries, on the grounds that if there is some long-term pinning that occurs when alloc_contig_range() returns -EBUSY, then it should be debugged and fixed. Would it be possible to make this infinite-retrying something that could be enabled or disabled by a defconfig option?
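[Such a knob could gate just the new branch. A rough sketch, with a made-up CONFIG_CMA_RETRY_INDEFINITELY symbol; the corresponding Kconfig entry would live under mm/Kconfig and depend on CMA:]

		if (bitmap_no >= bitmap_maxno) {
#ifdef CONFIG_CMA_RETRY_INDEFINITELY
			/* treat -EBUSY as transient and keep retrying */
			if (ret == -EBUSY) {
				mutex_unlock(&cma->lock);
				start = 0;
				ret = -ENOMEM;
				msleep(100);
				continue;
			}
#endif
			/* bitmap full, or retries compiled out: give up */
			mutex_unlock(&cma->lock);
			break;
		}

[With the symbol unset, the -EBUSY case falls through to the original unlock-and-break, preserving today's behavior.]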
Two thoughts:
This means I strongly prefer something like [3] if feasible.
_Resending so that this ends up on LKML_
I can give [3] some further thought then. Also, I realized [3] will not completely solve the problem; it just reduces the window in which _refcount > _mapcount (as mentioned in earlier threads, we encountered the pinning when a task in copy_one_pte() or in the exit_mmap() path gets context switched out). If we were to try a sleeping-lock based solution, do you think it would be permissible to add another lock to struct page?
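[For reference, the fork-side window being described, abridged from the 5.9-era copy_one_pte() in mm/memory.c; not a verbatim quote, comments added:]

	if (page) {
		get_page(page);			/* _refcount goes up first... */
		page_dup_rmap(page, false);	/* ...then _mapcount catches up */
		rss[mm_counter(page)]++;
	}

[While a task holds the extra reference without the matching mapcount, as described in the earlier threads, migration sees more references than mappings, treats the page as pinned, and alloc_contig_range() returns -EBUSY. The exit_mmap() case is analogous: zap_pte_range() drops the mapcount via page_remove_rmap() while the mmu_gather batch still holds page references until tlb_finish_mmu() frees them.]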