Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the anon_vmasof a mergeable VMA

From: Rik van Riel
Date: Wed Apr 07 2010 - 11:54:14 EST


On 04/07/2010 11:30 AM, Linus Torvalds wrote:

I've said this now _three_ times, but let me repeat once more:

- the locking rules for that anon_vma_chain are very unclear. I _think_
you mean for them to be "mmap_sem held for writing, _or_ mmap_sem held
for reading and page_table_lock held", but nowhere is that actually
documented.

Why is it so hard for you to just admit that? Especially after you
yourself got it wrong.

You are right, the idea was to continue use the locking that
the anon_vma code was already using, without introducing any
new locking with the anon_vma patches.

However, it has become clear that this is no longer possible,
due to the need to hold a secondary lock across anon_vma_clone,
when we come from a code path that holds the mmap_sem for read.

+ merge_vma = find_mergeable_anon_vma(vma);
+ if (merge_vma) {
+ int ret;
+ spin_lock(&mm->page_table_lock);
+ ret = anon_vma_clone(vma, merge_vma);
+ if (!ret)
+ vma->anon_vma = merge_vma->anon_vma;
+ spin_unlock(&mm->page_table_lock);
+ return ret;
+ }

Rik, the above is obviously total crap.

anon_vma_clone() needs to allocate memory, and it does so with GFP_KERNEL.
You can't do that with a spinlock held.

Looks like we'll either have to introduce a per-mm semaphore for
the same_vma anon_vma chains, or move the complexity of solving
this bug to anon_vma_merge, where we can ensure that the resulting
VMA has the sum of the anon_vmas of each VMA.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/