well, I need to study the page migration patch more (this is the
first time I've heard of it). But it sounds as if my patch and the
page migration patch are complementary.
On the other hand, there is some work to be done wrt memory policies
and page migration. For the project I am working on, we need to be able
to move all of the pages used by a process on one set of nodes to another
set of nodes. At some point during this process we will need to update
the memory policy for that process. For Steve's patch, we will
similarly need to update the policy associated with files associated with
the process, I would think, elsewise new pages will get allocated on the
old set of nodes, which is something we don't want. Sounds like some
new interfaces will have to be developed here. Does that make sense
to you, Andi and Steve?
yes.
My personal preference would be to keep as much of this as possible
under user space control; that is, rather than having a big autonomous
system call that migrates pages and then updates policy information,
I'd prefer to split the work into several smaller system calls that
are issued by a user space program responsible for coordinating the
process migration as a series of steps, e. g.:
(1) suspend the process via SIGSTOP
(2) update the mempolicy information
(3) migrate the process's pages
(4) migrate the process to the new cpu via set_schedaffinity()
(5) resume the process via SIGCONT
steps 2 and 3 can be accomplished by a call to mbind() and
specifying MPOL_MF_MOVE. And since mbind() takes an
address range, you could probably migrate pages and change
the policies for all of the process' mappings in a single mbind()
call.
Note that Andrew had to drop my patch from 2.6.10, because
the 4-level page tables feature was re-implemented using a
different interface, which broke my patch. So Andrew asked me
to re-do the patch for inclusion in 2.6.11. That gives us ~2 months
to work on integrating the page migration and NUMA mempolicy
filemap patches.
Ray, btw it is beneficial that I can work with you on this, because I have
no access to true NUMA machines. My testing of the filemap mempolicy
patch has only been on a UP discontiguous memory system. I assume
you've got access to Altix machines at SGI to do testing and benchmarking
of my filemap patch and your migration patches.
Steve