Re: [RFC] de-asmify the x86-64 system call slowpath

From: Linus Torvalds
Date: Wed Feb 05 2014 - 23:34:08 EST


On Wed, Feb 5, 2014 at 6:32 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> No, I was thinking "try to optimistically map 8 adjacent aligned pages
> at a time" - that would be the same cacheline in the page tables, so
> it would be fairly cheap if we couple it with a gang-lookup of the
> pages in the page cache

Doing the gang-lookup is hard, since it's all abstracted away, but the
attached patch kind of tries to do what I described.

This patch probably doesn't work, but something *like* this might be
worth playing with.

Except I suspect the page-backed faults are actually the minority:
judging by how high clear_page_c_e is in the profile, it's probably
mostly anonymous memory. I have no idea why I started with the (more
complex) case of file-backed prefaulting. Oh well.
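
For readers following along, the "8 adjacent aligned pages" window can be sketched in user-space C. On x86-64 a PTE is 8 bytes, so 8 adjacent aligned PTEs fill one 64-byte cache line. Everything below is an illustrative assumption rather than kernel code: struct toy_vma and the three helpers are hypothetical names, and 4 KiB pages are assumed. Unlike the patch, which gives up when the aligned start falls below vm_start, this sketch clamps the window to the VMA.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT          12                          /* assumption: 4 KiB pages */
#define PAGE_SIZE           (UINT64_C(1) << PAGE_SHIFT)
#define FAULT_AROUND_SHIFT  3                           /* 1 << 3 = 8 pages, as in the patch */
#define FAULT_AROUND_BYTES  (PAGE_SIZE << FAULT_AROUND_SHIFT)

/* Hypothetical stand-in for the vm_start/vm_end/vm_pgoff fields of a VMA. */
struct toy_vma {
	uint64_t vm_start;   /* first mapped address */
	uint64_t vm_end;     /* one past the last mapped address */
	uint64_t vm_pgoff;   /* file offset of vm_start, in pages */
};

/* Start of the aligned 8-page window containing addr, clamped to the VMA. */
static uint64_t window_start(const struct toy_vma *vma, uint64_t addr)
{
	uint64_t start = addr & ~(FAULT_AROUND_BYTES - 1);

	assert(addr >= vma->vm_start && addr < vma->vm_end);
	return start < vma->vm_start ? vma->vm_start : start;
}

/* One past the end of the same window, clamped to the VMA. */
static uint64_t window_end(const struct toy_vma *vma, uint64_t addr)
{
	uint64_t end = (addr & ~(FAULT_AROUND_BYTES - 1)) + FAULT_AROUND_BYTES;

	return end > vma->vm_end ? vma->vm_end : end;
}

/* File page offset backing addr: the same arithmetic as pgoff in the patch. */
static uint64_t page_offset_in_file(const struct toy_vma *vma, uint64_t addr)
{
	return ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}
```

A caller would walk from window_start() to window_end() in PAGE_SIZE steps, skipping any page whose PTE is already populated.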

Linus

 mm/memory.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index be6a0c0d4ae0..d52ec6a344dc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3487,15 +3487,55 @@ uncharge_out:
 	return ret;
 }

+#define no_fault_around(x) (x & (VM_FAULT_ERROR | VM_FAULT_MAJOR | VM_FAULT_RETRY))
+#define FAULT_AROUND_SHIFT (3)
+
+static void fault_around(struct mm_struct *mm, struct vm_area_struct *vma,
+	unsigned long address, pmd_t *pmd)
+{
+	int nr = 1 << FAULT_AROUND_SHIFT;
+
+	address &= PAGE_MASK << FAULT_AROUND_SHIFT;
+	if (address < vma->vm_start)
+		return;
+
+	do {
+		pte_t *pte;
+		pte_t entry;
+		pgoff_t pgoff;
+
+		pte = pte_offset_map(pmd, address);
+		entry = *pte;
+
+		pte_unmap(pte);
+		if (!pte_none(entry))
+			continue;
+		pgoff = (address - vma->vm_start) >> PAGE_SHIFT;
+		pgoff += vma->vm_pgoff;
+		if (no_fault_around(__do_fault(mm, vma, address, pmd, pgoff, 0, entry)))
+			break;
+	} while (address += PAGE_SIZE, address < vma->vm_end && --nr > 0);
+}
+
 static int do_linear_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		unsigned long address, pte_t *page_table, pmd_t *pmd,
 		unsigned int flags, pte_t orig_pte)
 {
+	int ret;
 	pgoff_t pgoff = (((address & PAGE_MASK)
 			- vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;

 	pte_unmap(page_table);
-	return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
+	ret = __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
+
+	/*
+	 * If the page we were looking for succeeded with no retries,
+	 * see if we can fault around it too..
+	 */
+	if (!no_fault_around(ret) && (flags & FAULT_FLAG_ALLOW_RETRY))
+		fault_around(mm, vma, address, pmd);
+
+	return ret;
 }

/*