On Thu, Oct 11, 2012 at 01:34:12PM -0400, Rik van Riel wrote:
> That is indeed a future optimization I have suggested
> in the past. Allocation of this struct could be deferred
> until the first time knuma_scand unmaps pages from the
> process to generate NUMA page faults.
I already tried this, and quickly noticed that for mm_autonuma we
can't defer it, or we wouldn't have the memory to queue the "mm" into
knuma_scand in the first place.
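
To illustrate the mm_autonuma constraint, here is a minimal userspace
sketch, not the actual AutoNUMA code. It assumes the queue linkage lives
inside the per-mm structure; mm_autonuma_sketch, scan_queue and
scan_queue_add are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real kernel structures. */
struct list_node {
	struct list_node *next;
};

struct mm_autonuma_sketch {
	struct list_node scan_node;	/* linkage used to queue the mm for scanning */
	int mm_id;
};

struct scan_queue {
	struct list_node *head;
};

/*
 * Queueing needs the node storage to exist already: a deferred
 * (still NULL) per-mm structure has nothing to link into the queue.
 */
static int scan_queue_add(struct scan_queue *q, struct mm_autonuma_sketch *ma)
{
	assert(q != NULL);
	if (ma == NULL)
		return -1;	/* nothing allocated, nothing to queue */
	ma->scan_node.next = q->head;
	q->head = &ma->scan_node;
	return 0;
}
```

If the scanner is the only thing that could trigger the allocation, and
the scanner only sees what is on the queue, the structure has to exist
before the first scan.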
For task_autonuma we could, but then we wouldn't be able to inherit
the task_autonuma->task_autonuma_nid across clone/fork which kind of
makes sense to me (and it's done by default without knob at the
moment). It's actually more important for clone than for fork but it
might be good for fork too if it doesn't exec immediately.
Another option is to move task_autonuma_nid into the task_struct
(it's in the stack so it won't cost RAM). Then I can probably defer
the task_autonuma allocation if I remove the child_inheritance knob.
In knuma_scand we don't have the task pointer, so task_autonuma would
need to be allocated in the NUMA page fault path, the first time a
fault fires.
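
A rough userspace sketch of that alternative, under stated assumptions:
the nid is kept directly in the task so it can be inherited without any
allocation, and the statistics structure is allocated lazily on the
first NUMA fault. All names here (task_sketch, task_autonuma_sketch,
task_fork, numa_fault, NID_NONE) are hypothetical, and calloc() stands
in for whatever kernel allocator would really be used.

```c
#include <assert.h>
#include <stdlib.h>

#define NID_NONE (-1)

/* Hypothetical per-task statistics, allocated only when needed. */
struct task_autonuma_sketch {
	unsigned long faults;
};

struct task_sketch {
	int autonuma_nid;			/* kept in the task: inheritable for free */
	struct task_autonuma_sketch *autonuma;	/* NULL until the first NUMA fault */
};

static void task_init(struct task_sketch *t)
{
	t->autonuma_nid = NID_NONE;
	t->autonuma = NULL;
}

/* clone/fork: the child inherits the nid, but not the statistics. */
static void task_fork(const struct task_sketch *parent,
		      struct task_sketch *child)
{
	child->autonuma_nid = parent->autonuma_nid;
	child->autonuma = NULL;
}

/* NUMA fault path: the first fault pays for the allocation. */
static int numa_fault(struct task_sketch *t)
{
	assert(t != NULL);
	if (t->autonuma == NULL) {
		t->autonuma = calloc(1, sizeof(*t->autonuma));
		if (t->autonuma == NULL)
			return -1;	/* allocation failed, skip accounting */
	}
	t->autonuma->faults++;
	return 0;
}

static void task_exit(struct task_sketch *t)
{
	free(t->autonuma);
	t->autonuma = NULL;
}
```

In this sketch a task that never takes a NUMA fault never allocates the
structure, while the inherited nid stays available from the moment of
the clone/fork.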