Re: [RFC Patch V1 22/30] mm, of: Use cpu_to_mem()/numa_mem_id() to support memoryless node
From: Nishanth Aravamudan
Date: Mon Jul 28 2014 - 15:26:30 EST
On 28.07.2014 [07:30:40 -0600], Grant Likely wrote:
> On Mon, 21 Jul 2014 10:52:41 -0700, Nishanth Aravamudan <nacc@xxxxxxxxxxxxxxxxxx> wrote:
> > On 11.07.2014 [15:37:39 +0800], Jiang Liu wrote:
> > > When CONFIG_HAVE_MEMORYLESS_NODES is enabled, cpu_to_node()/numa_node_id()
> > > may return a node without memory, and later cause system failure/panic
> > > when calling kmalloc_node() and friends with returned node id.
> > > So use cpu_to_mem()/numa_mem_id() instead to get the nearest node with
> > > memory for the/current cpu.
> > >
> > > If CONFIG_HAVE_MEMORYLESS_NODES is disabled, cpu_to_mem()/numa_mem_id()
> > > is the same as cpu_to_node()/numa_node_id().
> > >
> > > Signed-off-by: Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx>
> > > ---
> > > drivers/of/base.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/of/base.c b/drivers/of/base.c
> > > index b9864806e9b8..40d4772973ad 100644
> > > --- a/drivers/of/base.c
> > > +++ b/drivers/of/base.c
> > > @@ -85,7 +85,7 @@ EXPORT_SYMBOL(of_n_size_cells);
> > > #ifdef CONFIG_NUMA
> > > int __weak of_node_to_nid(struct device_node *np)
> > > {
> > > - return numa_node_id();
> > > + return numa_mem_id();
> > > }
> > > #endif
> >
> > Um, NAK. of_node_to_nid() returns the NUMA node ID for a given device
> > tree node. The default should be the physically local NUMA node, not the
> > nearest memory-containing node.
>
> That description doesn't match the code. This patch only changes the
> default implementation of of_node_to_nid() which doesn't take the device
> node into account *at all* when returning a node ID. Just look at the
> diff.
I meant that of_node_to_nid() seems to be used throughout the call-sites
to indicate caller locality. We want to keep using cpu_to_node() there,
and fallback appropriately in the MM (when allocations occur offnode due
to memoryless nodes), not indicate memory-specific topology the caller
itself. There was a long thread between between Tejun and I that
discussed what we are trying for: https://lkml.org/lkml/2014/7/18/278
I understand that the code unconditionally returns current's NUMA node
ID right now (ignoring the device node). That seems correct, to me, for
something like:
of_device_add:
/* device_add will assume that this device is on the same node as
* the parent. If there is no parent defined, set the node
* explicitly */
if (!ofdev->dev.parent)
set_dev_node(&ofdev->dev, of_node_to_nid(ofdev->dev.of_node));
I don't think we want the default implementation to set the NUMA node of
a dev to the nearest NUMA node with memory?
> I think this patch is correct, and it doesn't affect the override
> versions provided by powerpc and sparc.
Yes, agreed, so maybe it doesn't matter. I guess my point was simply
that it only seems reasonable to change callers of cpu_to_node() to
cpu_to_mem() that aren't in the core MM is if they care about memoryless
nodes explicitly. I don't think the OF code does, so I don't think it
should change.
Sorry for my premature NAK and lack of clarity in my explanation.
-Nish
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/