Use the newly added unpin_user_page_range_dirty_lock()
for more quickly unpinning a consecutive range of pages
represented as compound pages. This will also calculate
number of pages to unpin (for the tail pages which matching
head page) and thus batch the refcount update.
Running a test program which calls mr reg/unreg on a 1G in size
and measures cost of both operations together (in a guest using rxe)
with THP and hugetlbfs:
Before:
590 rounds in 5.003 sec: 8480.335 usec / round
6898 rounds in 60.001 sec: 8698.367 usec / round
After:
2631 rounds in 5.001 sec: 1900.618 usec / round
31625 rounds in 60.001 sec: 1897.267 usec / round
Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
---
drivers/infiniband/core/umem.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 2dde99a9ba07..ea4ebb3261d9 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -47,17 +47,17 @@
static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
{
- struct sg_page_iter sg_iter;
- struct page *page;
+ bool make_dirty = umem->writable && dirty;
+ struct scatterlist *sg;
+ int i;
if (umem->nmap > 0)
ib_dma_unmap_sg(dev, umem->sg_head.sgl, umem->sg_nents,
DMA_BIDIRECTIONAL);
- for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
- page = sg_page_iter_page(&sg_iter);
- unpin_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
- }
+ for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i)
+ unpin_user_page_range_dirty_lock(sg_page(sg),
+ DIV_ROUND_UP(sg->length, PAGE_SIZE), make_dirty);
sg_free_table(&umem->sg_head);
}