On Sun, 2008-08-24 at 01:40 -0400, Oren Laadan wrote:This classification eventually simplifies both dump and restore. For+/* vma subtypes */Is this really necessary, or can we infer it from the contents of the
+enum {
+ CR_VMA_ANON = 1,
+ CR_VMA_FILE
+};
VMA?
instance, it decides whether a file name follows or not.
There will be more, later: CR_VMA_FILE_UNLINKED (mapped to an unlinked
file), CR_VMA_ANON_SHARED (shared anonymous), CR_VMA_ANON_SHARED_SKIP
(shared anonymous, had been sent before) and so on.
I still don't see there being a need to explicitly specify the
distinction. Why should a VMA mapping an unlinked file be any different
from a linked one? It should point over to the 'file' checkpoint
structure and let the real work be done there.
There are no truly anonymous shared areas. They anon ones are still
file-backed as far as the kernel is concerned. If we do the file saving
correctly, I think most of these problems just fall out.
struct cr_hdr_head {
__u64 magic;
This is almost identical to the original - see the preceding comment.+At best this needs to get folded back into its own function. This is
+ while (addr < end) {
+ struct page *page;
+
+ /* simplified version of get_user_pages(): already have vma,
+ * only need FOLL_TOUCH, and (for now) ignore fault stats */
+
+ cond_resched();
+ while (!(page = follow_page(vma, addr, FOLL_TOUCH))) {
+ ret = handle_mm_fault(vma->vm_mm, vma, addr, 0);
+ if (ret & VM_FAULT_ERROR) {
+ if (ret & VM_FAULT_OOM)
+ ret = -ENOMEM;
+ else if (ret & VM_FAULT_SIGBUS)
+ ret = -EFAULT;
+ else
+ BUG();
+ break;
+ }
+ cond_resched();
+ }
Exactly. The code is copy-and-pasted. If there's a bug in the
original, who will change this one? Better to simply consolidate the
code into one copy.
pretty hard to read as-is. Also, are there truly no in-kernel functionsCan you suggest one ?
that can be used for this?
This is where the mentality has to shift. Instead of thinking "there is
no exact in-kernel match for what I want here" we need to consider that
we can modify the in-kernel ones. They can be modified to suit both
existing and the new needs that we have.
It is my understanding that the code must not sleep between kmap_atomic()+ for (pgarr = ctx->pgarr; pgarr; pgarr = pgarr->next) {Why not use kmap_atomic() here?
+ struct page **pages = pgarr->pages;
+ int nr = pgarr->nused;
+ void *ptr;
+
+ while (nr--) {
+ ptr = kmap(*pages);
+ ret = cr_kwrite(ctx, ptr, PAGE_SIZE);
+ kunmap(*pages);
and kunmap_atomic().
Yes, but you're going to absolutely kill performance on systems which
have expensive global TLB flushes. Frequent kmaps()s should be avoided
at all costs.
The best way around this would be to change the interface to cr_kwrite()
so that it didn't have to use *mapped* kernel memory. Maybe an
interface that takes 'struct page'. Or, one where we kmap_atomic() the
buffer, kunmap_atomic(), copy to a temporary buffer, then cr_kwrite().
:) well, the usual EINVAL didn't seem suitable. Any better suggestions ?+static int cr_write_vma(struct cr_ctx *ctx, struct vm_area_struct *vma)Hmmm. Interesting error code for VM_HUGETLB. :)
+{
+ struct cr_hdr h;
+ struct cr_hdr_vma *hh = ctx->hbuf;
+ char *fname = NULL;
+ int flen = 0, how, nr, ret;
+
+ h.type = CR_HDR_VMA;
+ h.len = sizeof(*hh);
+ h.ptag = 0;
+
+ hh->vm_start = vma->vm_start;
+ hh->vm_end = vma->vm_end;
+ hh->vm_page_prot = vma->vm_page_prot.pgprot;
+ hh->vm_flags = vma->vm_flags;
+ hh->vm_pgoff = vma->vm_pgoff;
+
+ if (vma->vm_flags & (VM_SHARED | VM_IO | VM_HUGETLB | VM_NONLINEAR)) {
+ pr_warning("CR: unknown VMA %#lx\n", vma->vm_flags);
+ return -ETXTBSY;
+ }
-ENOSUP might be clearest for now. "Your system call tried to do
something unsupported."