On 2015/10/9 4:20, Andrew Morton wrote:
On Wed, 19 Aug 2015 17:18:15 -0700 (PDT) David Rientjes <rientjes@xxxxxxxxxx> wrote:
Hi Andrew,
On Wed, 19 Aug 2015, Patil, Kiran wrote:
Acked-by: Kiran Patil <kiran.patil@xxxxxxxxx>
Where's the call to preempt_disable() to prevent kernels with preemption
from making numa_node_id() invalid during this iteration?
David asked this question twice, received no answer and now the patch
is in the maintainer tree, destined for mainline.
If I was asked this question I would respond:
The use of numa_mem_id() is racy and best-effort. If the unlikely
race occurs, the memory allocation will occur on the wrong node, the
overall result being very slightly suboptimal performance. The
existing use of numa_node_id() suffers from the same issue.
But I'm not the person proposing the patch. Please don't just ignore
reviewer comments!
Apologies for the slow response; it was due to personal reasons.
And thanks for answering David's question. To be honest,
I didn't know how to answer it before. This question has actually
puzzled me for a long time in the context of memory hot-removal.
In the normal case, the race only causes sub-optimal memory
allocation if a scheduling event happens between querying the NUMA
node id and calling alloc_pages_node(). But what happens if the
system runs into the following execution sequence?
1) node = numa_mem_id();
2) memory hot-removal event triggers
2.1) remove affected memory
2.2) reset pgdat to zero if node becomes empty after memory removal
3) alloc_pages_node(), which may access the zeroed pgdat structure.
I haven't yet found a mechanism that protects the system from the above
sequence, and it has puzzled me for a long time :(. Does stop_machine()
protect the system from such an execution sequence?
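
To make the sequence above concrete, here is a minimal userspace sketch in C.
It is only a model, not kernel code: every name in it (model_pgdat,
model_local_node(), model_hot_remove(), model_alloc_on_node()) is made up for
illustration, and the re-check inside model_alloc_on_node() is a hypothetical
guard, not a description of what alloc_pages_node() actually does. It says
nothing about whether stop_machine() or any other mechanism closes this window
in the real kernel.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_NODES 2

/* Stands in for a per-node descriptor (pgdat) in this model. */
struct model_pgdat {
	bool online;
	unsigned long free_pages;
};

static struct model_pgdat model_nodes[MODEL_MAX_NODES];

/* Bring up every node with a few free pages. */
static void model_init(void)
{
	for (int i = 0; i < MODEL_MAX_NODES; i++) {
		model_nodes[i].online = true;
		model_nodes[i].free_pages = 16;
	}
}

/* Step 1: sample the "local" node id (assumed to be node 1 here). */
static int model_local_node(void)
{
	return 1;
}

/* Step 2: hot-removal empties the node and zeroes its descriptor. */
static void model_hot_remove(int nid)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	memset(&model_nodes[nid], 0, sizeof(model_nodes[nid]));
}

/*
 * Step 3: allocate one page using a node id sampled earlier.
 * Hypothetical guard: if the descriptor was zeroed in the meantime,
 * fall back to any node that is still online.
 * Returns the node that satisfied the request, or -1 if none could.
 */
static int model_alloc_on_node(int nid)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);

	if (model_nodes[nid].online && model_nodes[nid].free_pages > 0) {
		model_nodes[nid].free_pages--;
		return nid;
	}

	for (int i = 0; i < MODEL_MAX_NODES; i++) {
		if (model_nodes[i].online && model_nodes[i].free_pages > 0) {
			model_nodes[i].free_pages--;
			return i;
		}
	}
	return -1;
}
```

Calling model_init(), then nid = model_local_node(), then
model_hot_remove(nid), then model_alloc_on_node(nid) replays steps 1) to 3):
the id sampled in step 1 is stale by step 3, so the model only stays safe
because it re-validates the descriptor before touching it.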