Re: REGRESSION: Performance regressions from switching anon_vma->lockto mutex
From: Linus Torvalds
Date: Tue Jun 14 2011 - 23:44:32 EST
On Tue, Jun 14, 2011 at 6:21 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Anyway, please check me if I'm wrong, but won't the "anon_vma->root"
> be the same for all the anon_vma's that are associated with one
> particular vma?
>
> The reason I ask [...]
So here's a trial patch that moves the anon_vma locking one level up
in the anon_vma_clone() call chain. It actually does allow the root to
change, but has a WARN_ON_ONCE() if that ever happens.
I *suspect* this will help the locking numbers a bit, but I'd like to
note that it does this *only* for the anon_vma_clone() case, and the
exact same thing should be done for the exit case too (ie the
unlink_anon_vmas()). So if it does work it's still just one step on
the way, and there would be more work along the same lines to possibly
improve the locking further.
The patch is "tested" in the sense that I booted the kernel and am
running it right now (and compiled a kernel with it). But that's not a
whole lot of actual real life testing, so caveat emptor.
And I won't really even guarantee that the main problem locking-wise
would be a long chain of "same_vma" anon-vma's that this does with
just a single lock. So who knows - maybe it doesn't help at all. I
suspect it's worth testing, though.
Linus
mm/rmap.c | 18 ++++++++++++++++--
1 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 0eb463ea88dd..206c3fb072af 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -208,13 +208,11 @@ static void anon_vma_chain_link(struct vm_area_struct *vma,
avc->anon_vma = anon_vma;
list_add(&avc->same_vma, &vma->anon_vma_chain);
- anon_vma_lock(anon_vma);
/*
* It's critical to add new vmas to the tail of the anon_vma,
* see comment in huge_memory.c:__split_huge_page().
*/
list_add_tail(&avc->same_anon_vma, &anon_vma->head);
- anon_vma_unlock(anon_vma);
}
/*
@@ -224,16 +222,30 @@ static void anon_vma_chain_link(struct vm_area_struct *vma,
int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
{
struct anon_vma_chain *avc, *pavc;
+ struct anon_vma *root = NULL;
list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
+ struct anon_vma *anon_vma = pavc->anon_vma, *new_root = anon_vma->root;
+
+ if (new_root != root) {
+ if (WARN_ON_ONCE(root))
+ mutex_unlock(&root->mutex);
+ root = new_root;
+ mutex_lock(&root->mutex);
+ }
+
avc = anon_vma_chain_alloc();
if (!avc)
goto enomem_failure;
anon_vma_chain_link(dst, avc, pavc->anon_vma);
}
+ if (root)
+ mutex_unlock(&root->mutex);
return 0;
enomem_failure:
+ if (root)
+ mutex_unlock(&root->mutex);
unlink_anon_vmas(dst);
return -ENOMEM;
}
@@ -280,7 +292,9 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
get_anon_vma(anon_vma->root);
/* Mark this anon_vma as the one where our new (COWed) pages go. */
vma->anon_vma = anon_vma;
+ anon_vma_lock(anon_vma);
anon_vma_chain_link(vma, avc, anon_vma);
+ anon_vma_unlock(anon_vma);
return 0;