Re: [PATCH] driver core: Fix test_async_driver_probe if NUMA is disabled

From: Guenter Roeck
Date: Wed Nov 27 2019 - 19:33:29 EST


On 11/27/19 3:13 PM, Alexander Duyck wrote:
On Wed, 2019-11-27 at 14:42 -0800, Guenter Roeck wrote:
On 11/27/19 1:24 PM, Alexander Duyck wrote:
On Wed, 2019-11-27 at 12:24 -0800, Guenter Roeck wrote:
Since commit 57ea974fb871 ("driver core: Rewrite test_async_driver_probe
to cover serialization and NUMA affinity"), running the test with NUMA
disabled results in warning messages similar to the following.

test_async_driver test_async_driver.12: NUMA node mismatch -1 != 0

If CONFIG_NUMA=n, dev_to_node(dev) returns -1, and numa_node_id()
returns 0. Both are widely used, so it appears risky to change return
values. Augment the check with IS_ENABLED(CONFIG_NUMA) instead
to fix the problem.

Cc: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
Fixes: 57ea974fb871 ("driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity")
Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
---
drivers/base/test/test_async_driver_probe.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/base/test/test_async_driver_probe.c b/drivers/base/test/test_async_driver_probe.c
index f4b1d8e54daf..3bb7beb127a9 100644
--- a/drivers/base/test/test_async_driver_probe.c
+++ b/drivers/base/test/test_async_driver_probe.c
@@ -44,7 +44,8 @@ static int test_probe(struct platform_device *pdev)
* performing an async init on that node.
*/
if (dev->driver->probe_type == PROBE_PREFER_ASYNCHRONOUS) {
- if (dev_to_node(dev) != numa_node_id()) {
+ if (IS_ENABLED(CONFIG_NUMA) &&
+ dev_to_node(dev) != numa_node_id()) {
dev_warn(dev, "NUMA node mismatch %d != %d\n",
dev_to_node(dev), numa_node_id());
atomic_inc(&warnings);

I'm not sure that is really the correct fix. It might be better to test it
against NUMA_NO_NODE and then if it is not that make sure that it matches
the node ID. Adding the check against NUMA_NO_NODE would resolve the issue
for cases where the device might be assigned to multiple NUMA nodes.

I think you are suggesting that dev_to_node(dev) might return NUMA_NO_NODE
even on systems with CONFIG_NUMA enabled. I have no idea if that can happen.
The code in test_async_probe_init() seems to suggest that the node is set
to a valid node id for all asynchronous nodes, so I don't immediately see
how that could be the case. I may be missing something, of course.

Well thinking back to the Nehalem architecture I seem to recall that there
were devices that were connected to a shared IOH that was accessible
across both nodes. I thought that they might have a node ID of
NUMA_NO_NODE since they didn't really belong to either of the two nodes in
the sytem.

It would effectively work out the same as your patch compiler wise since
dev_to_node would be NUMA_NO_NODE in the non-NUMA case so it would compile
out the warning since it would fail the first check, and in the NUMA case
it would add an extra check to make sure that dev_to_node is actually
indicating the device needs a specific node in the NUMA enabled case.


I thought the code is specifically checking devices which it previously
created, which are well defined and understood test devices. After all,
the check is in the test driver's probe function. Guess I really don't
understand the code. Please take my patch as bug report, and submit
whatever fix you think is correct.

Thanks,
Guenter