Re: [PATCH] mm: softdirty: write protect PTEs created for read faults after VM_SOFTDIRTY cleared

From: Peter Feiner
Date: Thu Aug 21 2014 - 15:37:43 EST


On Thu, Aug 21, 2014 at 02:45:43AM +0300, Kirill A. Shutemov wrote:
> On Wed, Aug 20, 2014 at 05:46:22PM -0400, Peter Feiner wrote:
> It basically means VM_SOFTDIRTY requires writenotify on the vma.
>
> What about the patch below? Untested. It seems it'll introduce a bug
> similar to the one fixed by c9d0bf241451, *but* IIUC we already have it in
> the mprotect() code path.
>
> I'll look more carefully tomorrow.
>
> Not-signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index dfc791c42d64..67d509a15969 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -851,8 +851,9 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
> if (type == CLEAR_REFS_MAPPED && !vma->vm_file)
> continue;
> if (type == CLEAR_REFS_SOFT_DIRTY) {
> - if (vma->vm_flags & VM_SOFTDIRTY)
> - vma->vm_flags &= ~VM_SOFTDIRTY;
> + vma->vm_flags &= ~VM_SOFTDIRTY;
> + vma->vm_page_prot = vm_get_page_prot(
> + vma->vm_flags & ~VM_SHARED);
> }
> walk_page_range(vma->vm_start, vma->vm_end,
> &clear_refs_walk);
> --
> Kirill A. Shutemov

Thanks Kirill, I prefer your approach. I'll send a v2.
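
For anyone following along from userspace, here is a rough sketch of how the
soft-dirty state this patch is about can be cleared and inspected. It assumes
a kernel built with CONFIG_MEM_SOFT_DIRTY and relies on the documented
interface (writing "4" to /proc/PID/clear_refs, bit 55 of each
/proc/PID/pagemap entry). The helper names clear_soft_dirty() and
page_soft_dirty() are made up for illustration and are not part of any patch.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: clear soft-dirty state for the calling process.
 * Returns 0 on success, -1 if the interface is unavailable. */
static int clear_soft_dirty(void)
{
	int fd = open("/proc/self/clear_refs", O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = write(fd, "4", 1);	/* "4" selects the soft-dirty clear */
	close(fd);
	return n == 1 ? 0 : -1;
}

/* Hypothetical helper: report the soft-dirty bit for the page holding addr.
 * Returns 1 if set, 0 if clear, -1 if pagemap could not be read. */
static int page_soft_dirty(const void *addr)
{
	long psz = sysconf(_SC_PAGESIZE);
	uint64_t entry;
	off_t off;
	ssize_t n;
	int fd;

	if (psz <= 0)
		return -1;
	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;
	/* One 64-bit entry per virtual page, indexed by page number. */
	off = (off_t)((uintptr_t)addr / (uintptr_t)psz) * (off_t)sizeof(entry);
	if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
		close(fd);
		return -1;
	}
	n = read(fd, &entry, sizeof(entry));
	close(fd);
	if (n != (ssize_t)sizeof(entry))
		return -1;
	return (int)((entry >> 55) & 1);	/* bit 55: soft-dirty */
}
```

The sequence of interest is: call clear_soft_dirty(), read from a page that
has not been touched yet, write to it, then call page_soft_dirty() on it. If
the read fault installed a writable PTE, the later write would not fault and
the bit could be missed.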

I believe you're right about c9d0bf241451. Passing the old and new pgprot
through pgprot_modify should handle the problem, since that keeps the
caching attributes a driver may have set in the old value. Furthermore, as
you suggest, mprotect_fixup should also use pgprot_modify when it turns
write notification on. I think a patch like this is in order:

Not-signed-off-by: Peter Feiner <pfeiner@xxxxxxxxxx>

diff --git a/mm/mmap.c b/mm/mmap.c
index c1f2ea4..86f89a1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1611,18 +1611,15 @@ munmap_back:
}

if (vma_wants_writenotify(vma)) {
- pgprot_t pprot = vma->vm_page_prot;
-
/* Can vma->vm_page_prot have changed??
*
* Answer: Yes, drivers may have changed it in their
* f_op->mmap method.
*
- * Ensures that vmas marked as uncached stay that way.
+ * Ensures that vmas marked with special bits stay that way.
*/
- vma->vm_page_prot = vm_get_page_prot(vm_flags & ~VM_SHARED);
- if (pgprot_val(pprot) == pgprot_val(pgprot_noncached(pprot)))
- vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+ vma->vm_page_prot = pgprot_modify(vma->vm_page_prot,
+ vm_get_page_prot(vm_flags & ~VM_SHARED));
}

vma_link(mm, vma, prev, rb_link, rb_parent);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index c43d557..6826313 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -324,7 +324,8 @@ success:
vm_get_page_prot(newflags));

if (vma_wants_writenotify(vma)) {
- vma->vm_page_prot = vm_get_page_prot(newflags & ~VM_SHARED);
+ vma->vm_page_prot = pgprot_modify(vma->vm_page_prot,
+ vm_get_page_prot(newflags & ~VM_SHARED));
dirty_accountable = 1;
}
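
To spell out the intent of the two hunks above, here is a small userspace
model. It is only a sketch: the model_* names and the bit layout are invented
for illustration, and the real pgprot_t, protection_map and pgprot_modify are
per-architecture. The assumption being modeled is that dropping VM_SHARED
yields a non-writable protection, and that pgprot_modify carries the old
caching attributes over to the new protection.

```c
#include <stdint.h>

/* Hypothetical stand-in for pgprot_t; the bit positions are arbitrary. */
typedef struct { uint64_t bits; } model_pgprot_t;

#define MODEL_PROT_PRESENT	(UINT64_C(1) << 0)
#define MODEL_PROT_RW		(UINT64_C(1) << 1)
#define MODEL_PROT_NOCACHE	(UINT64_C(1) << 4)
#define MODEL_PROT_WRCOMB	(UINT64_C(1) << 5)
#define MODEL_CACHE_MASK	(MODEL_PROT_NOCACHE | MODEL_PROT_WRCOMB)

#define MODEL_VM_READ		0x1UL
#define MODEL_VM_WRITE		0x2UL
#define MODEL_VM_SHARED		0x8UL

/* Model of vm_get_page_prot(): only shared writable mappings get a
 * writable protection; private writable ones stay read-only (COW). */
static model_pgprot_t model_vm_get_page_prot(unsigned long vm_flags)
{
	model_pgprot_t p = { MODEL_PROT_PRESENT };

	if ((vm_flags & MODEL_VM_WRITE) && (vm_flags & MODEL_VM_SHARED))
		p.bits |= MODEL_PROT_RW;
	return p;
}

/* Model of pgprot_modify(): caching bits come from the old value,
 * everything else from the new one. */
static model_pgprot_t model_pgprot_modify(model_pgprot_t oldprot,
					  model_pgprot_t newprot)
{
	model_pgprot_t out;

	out.bits = (newprot.bits & ~MODEL_CACHE_MASK) |
		   (oldprot.bits & MODEL_CACHE_MASK);
	return out;
}

/* What the writenotify branches compute in this model. */
static model_pgprot_t model_writenotify_prot(model_pgprot_t cur,
					     unsigned long vm_flags)
{
	return model_pgprot_modify(cur,
			model_vm_get_page_prot(vm_flags & ~MODEL_VM_SHARED));
}
```

With a driver-supplied protection that is writable and uncached on a shared
writable vma, model_writenotify_prot() returns a protection that is still
uncached but no longer writable, which is the combination write notification
needs.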
