Re: [RFC v4][PATCH 4/9] Memory management (dump)

From: Dave Hansen
Date: Wed Sep 10 2008 - 12:55:57 EST


On Tue, 2008-09-09 at 03:42 -0400, Oren Laadan wrote:
> +	while (addr < end) {
> +		struct page *page;
> +
> +		/*
> +		 * simplified version of get_user_pages(): already have vma,
> +		 * only need FOLL_TOUCH, and (for now) ignore fault stats.
> +		 *
> +		 * FIXME: consolidate with get_user_pages()
> +		 */
> +
> +		cond_resched();
> +		while (!(page = follow_page(vma, addr, FOLL_TOUCH))) {
> +			ret = handle_mm_fault(vma->vm_mm, vma, addr, 0);
> +			if (ret & VM_FAULT_ERROR) {
> +				if (ret & VM_FAULT_OOM)
> +					ret = -ENOMEM;
> +				else if (ret & VM_FAULT_SIGBUS)
> +					ret = -EFAULT;
> +				else
> +					BUG();
> +				break;
> +			}
> +			cond_resched();
> +			ret = 0;
> +		}

get_user_pages() is really the wrong thing to use here. It makes pages
*present* so that we can do things like hand them off to a driver. For
checkpointing, we really don't care about that. It's a waste of time,
for instance, to take faults just to fill the mappings up with zero
pages and page tables. Just think of what will happen the first time we
touch a very large, very sparse anonymous area: we'll probably kill the
system just allocating page tables. Take a look at the comment in
follow_page(). This is a similar operation to core dumping, and we need
to be careful.
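
To make the concern concrete, here is a rough user-space analogue, not
kernel code and not a proposal for the patch. It assumes Linux and uses
mincore() to ask which pages of a sparse anonymous mapping are resident
without touching any of them, so nothing gets faulted in just to be
inspected. The helper names count_resident_pages() and sparse_demo() are
made up for illustration, and the MADV_NOHUGEPAGE hint is only there to
keep transparent huge pages from inflating the count.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count the pages in [addr, addr + len) that mincore() reports as
 * resident. The range itself is never read or written, so no page
 * faults are taken on it. addr must be page-aligned.
 * Returns the count, or -1 on error.
 */
static long count_resident_pages(void *addr, size_t len)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t npages, i;
	unsigned char *vec;
	long resident = 0;

	assert(page_size > 0);
	npages = (len + (size_t)page_size - 1) / (size_t)page_size;

	vec = malloc(npages);
	if (!vec)
		return -1;

	if (mincore(addr, len, vec) != 0) {
		free(vec);
		return -1;
	}

	/* Only the least significant bit of each entry is defined. */
	for (i = 0; i < npages; i++)
		if (vec[i] & 1)
			resident++;

	free(vec);
	return resident;
}

/*
 * Hypothetical demo: map npages of anonymous memory, write to ntouch
 * pages spread evenly across it, then report how many pages are
 * resident. Returns -1 on bad arguments or on failure.
 */
static long sparse_demo(size_t npages, size_t ntouch)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t len, stride, i;
	long resident;
	char *area;

	if (page_size <= 0 || ntouch == 0 || ntouch > npages)
		return -1;

	len = npages * (size_t)page_size;
	stride = npages / ntouch;

	area = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED)
		return -1;

#ifdef MADV_NOHUGEPAGE
	/* Best effort: ask for small pages so one write maps one page. */
	(void)madvise(area, len, MADV_NOHUGEPAGE);
#endif

	/* Writing one byte faults in exactly the page that holds it. */
	for (i = 0; i < ntouch; i++)
		area[i * stride * (size_t)page_size] = 1;

	resident = count_resident_pages(area, len);
	munmap(area, len);
	return resident;
}
```

Calling sparse_demo(16384, 3) maps 16384 pages but only writes to
three of them. On a typical Linux box the resident count should stay
close to the number of pages written rather than the size of the
mapping, although the exact figure depends on the kernel and on huge
page settings. A loop that read every page in order to dump it would
instead populate the whole range.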

This might be fine for a proof of concept, but it needs to be thought
out much more thoroughly before getting merged. I guess I'm
volunteering to go do that.

-- Dave
