> > (4) Similarly to (1) I take it there's exactly one struct mm_struct per
> > struct task_struct, and each of the struct vm_area_struct
> > *mmap points to a chain of vma's unique to the task?
> No. Threads share mm structures. (See kernel/fork.c, copy_mm() where
> it checks CLONE_VM). mm->count is a reference count (see mmget() in
> <linux/sched.h>).
As long as threads share the whole of the mm stuff, page tables included,
I'm not worried there; as long as there's exactly one struct vm_area_struct
per VM area per heavyweight process, I'll be fine...
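
To check my understanding of the sharing, here is a toy model of it in
plain C. This is not the kernel code: every toy_* name and the flag value
are made up for illustration, and the real copy_mm() does a good deal more.
With the flag set the child points at the parent's mm and the count goes
up; without it the child gets a fresh mm of its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real clone flag; illustration only. */
#define TOY_CLONE_VM 0x100UL

struct toy_mm {
    int count;                  /* number of tasks sharing this mm */
};

struct toy_task {
    struct toy_mm *mm;
};

/* Give 'child' an mm: shared with 'parent' if TOY_CLONE_VM is set,
 * otherwise a freshly allocated one.  Returns 0 on success, -1 if the
 * allocation fails. */
static int toy_copy_mm(unsigned long flags, const struct toy_task *parent,
                       struct toy_task *child)
{
    struct toy_mm *mm;

    if (flags & TOY_CLONE_VM) {
        parent->mm->count++;    /* thread: one more user of the same mm */
        child->mm = parent->mm;
        return 0;
    }

    mm = malloc(sizeof(*mm));   /* heavyweight process: private mm */
    if (!mm)
        return -1;
    mm->count = 1;
    child->mm = mm;
    return 0;
}

/* Drop one reference; free the mm when the last user goes away. */
static void toy_mmput(struct toy_mm *mm)
{
    if (--mm->count == 0)
        free(mm);
}
```

So anything hanging off the mm (vma chain, page tables) exists once per
group of threads, which is all I need.
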
> > (5) When we start to swap a page out to disk, if the process wants
> > to write to that page, what happens? I can't find anything
> > to prevent the access, nor can I find anything that would
> > notice such an access, until the disk I/O completes and the
> > page gets replaced or hits the swap cache...
> Um, the code in mm/vmscan.c:try_to_swap_out sure looks like it clears the
> TLB entry before swapping out. get_swap_page returns a TLB entry for
> a not-present page, which is installed into the TLB and then the swapout
> is done.
Ah, there it is. Knew it had to be somewhere, was looking in ll_rw_page,
but never followed things all the way down the maze of twisty little
functions, all different. (Not as bad-looking as the maze of little
different functions, all twisty, that is slab.c, but OTOH I didn't miss
important stuff in slab.c that I know of, yet.)
> Another valid alternative just sets the page clean before the swap out,
> and when the I/O completes, if it was dirtied, I guess that wasn't
> a real good page to swap out...
That's what I'd do if I had to write it from scratch.
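
Something like this, as a sketch only. The toy_* names are mine, the dirty
flag stands in for whatever the hardware or fault handler would really set,
and this is the alternative scheme, not what mm/vmscan.c does today.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_page {
    bool dirty;      /* set whenever the process writes to the page */
    bool resident;   /* page frame still holds the data */
};

/* Begin writing the page to swap: mark it clean first, so that any
 * write during the I/O shows up as a fresh dirty bit. */
static void toy_start_swapout(struct toy_page *p)
{
    p->dirty = false;
}

/* Model of the process storing to the page. */
static void toy_write(struct toy_page *p)
{
    p->dirty = true;
}

/* I/O has completed.  Returns true if the frame was released, false if
 * the page was redirtied meanwhile (the disk copy is stale, so the page
 * must stay in memory). */
static bool toy_finish_swapout(struct toy_page *p)
{
    if (p->dirty)
        return false;
    p->resident = false;
    return true;
}
```

The cost is one wasted write whenever we guess wrong about a page, in
exchange for never having to block the writer.
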
> After this, you get out of my depth. I know that Linus has been resisting
> reverse page maps for a while, since a linked list through all the TLBs
> showing all the users of a given page doubles the size of the TLBs and
> causes all kinds of second-order performance problems.
TLB? I know I'll need to be able to do a struct page -> pte(s) lookup,
which would of course include any ptes cached in soft TLBs, but I was
hoping I could manage this linkage at, say, the vma level--after all,
that's part of what they're for, no? (Obviously, they can't go in with
the struct pages...) This may take a little arithmetic per vma involved
in the sharing,
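
roughly along these lines. This is only a sketch of the arithmetic I have
in mind: the toy_* names are hypothetical, the fields are modelled loosely
on vm_start/vm_end/vm_offset, and walking the list of vmas that share the
object is left out. Given a page's byte offset within the mapped object,
each vma can tell you the virtual address (and hence which pte) maps it,
if any.

```c
#include <assert.h>

struct toy_vma {
    unsigned long vm_start;    /* first virtual address of the area */
    unsigned long vm_end;      /* one past the last virtual address */
    unsigned long vm_offset;   /* object offset mapped at vm_start */
};

/* If 'vma' maps byte 'offset' of the backing object, store the matching
 * virtual address in *addr and return 1; otherwise return 0. */
static int toy_vma_address(const struct toy_vma *vma, unsigned long offset,
                           unsigned long *addr)
{
    unsigned long a;

    if (offset < vma->vm_offset)
        return 0;               /* page lies before this mapping */

    a = vma->vm_start + (offset - vma->vm_offset);
    if (a >= vma->vm_end)
        return 0;               /* page lies past this mapping */

    *addr = a;
    return 1;
}
```

From the address it is an ordinary page table walk in that vma's mm to
reach the pte.
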
Keith