Re: [PATCH] driver core: Fix test_async_driver_probe if NUMA is disabled

From: Alexander Duyck
Date: Wed Nov 27 2019 - 18:24:58 EST


On Wed, 2019-11-27 at 14:42 -0800, Guenter Roeck wrote:
> On 11/27/19 1:24 PM, Alexander Duyck wrote:
> > On Wed, 2019-11-27 at 12:24 -0800, Guenter Roeck wrote:
> > > Since commit 57ea974fb871 ("driver core: Rewrite test_async_driver_probe
> > > to cover serialization and NUMA affinity"), running the test with NUMA
> > > disabled results in warning messages similar to the following.
> > >
> > > test_async_driver test_async_driver.12: NUMA node mismatch -1 != 0
> > >
> > > If CONFIG_NUMA=n, dev_to_node(dev) returns -1, and numa_node_id()
> > > returns 0. Both are widely used, so it appears risky to change return
> > > values. Augment the check with IS_ENABLED(CONFIG_NUMA) instead
> > > to fix the problem.
> > >
> > > Cc: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> > > Fixes: 57ea974fb871 ("driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity")
> > > Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> > > ---
> > > drivers/base/test/test_async_driver_probe.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/base/test/test_async_driver_probe.c b/drivers/base/test/test_async_driver_probe.c
> > > index f4b1d8e54daf..3bb7beb127a9 100644
> > > --- a/drivers/base/test/test_async_driver_probe.c
> > > +++ b/drivers/base/test/test_async_driver_probe.c
> > > @@ -44,7 +44,8 @@ static int test_probe(struct platform_device *pdev)
> > > * performing an async init on that node.
> > > */
> > > if (dev->driver->probe_type == PROBE_PREFER_ASYNCHRONOUS) {
> > > - if (dev_to_node(dev) != numa_node_id()) {
> > > + if (IS_ENABLED(CONFIG_NUMA) &&
> > > + dev_to_node(dev) != numa_node_id()) {
> > > dev_warn(dev, "NUMA node mismatch %d != %d\n",
> > > dev_to_node(dev), numa_node_id());
> > > atomic_inc(&warnings);
> >
> > I'm not sure that is really the correct fix. It might be better to test it
> > against NUMA_NO_NODE and then if it is not that make sure that it matches
> > the node ID. Adding the check against NUMA_NO_NODE would resolve the issue
> > for cases where the device might be assigned to multiple NUMA nodes.
> >
> I think you are suggesting that dev_to_node(dev) might return NUMA_NO_NODE
> even on systems with CONFIG_NUMA enabled. I have no idea if that can happen.
> The code in test_async_probe_init() seems to suggest that the node is set
> to a valid node id for all asynchronous nodes, so I don't immediately see
> how that could be the case. I may be missing something, of course.

Well thinking back to the Nehalem architecture I seem to recall that there
were devices that were connected to a shared IOH that was accessible
across both nodes. I thought that they might have a node ID of
NUMA_NO_NODE since they didn't really belong to either of the two nodes in
the sytem.

It would effectively work out the same as your patch compiler wise since
dev_to_node would be NUMA_NO_NODE in the non-NUMA case so it would compile
out the warning since it would fail the first check, and in the NUMA case
it would add an extra check to make sure that dev_to_node is actually
indicating the device needs a specific node in the NUMA enabled case.