Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware
From: Peter Zijlstra
Date: Tue Sep 24 2019 - 09:00:25 EST
On Tue, Sep 24, 2019 at 02:43:25PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 24, 2019 at 02:25:00PM +0200, Michal Hocko wrote:
> > On Tue 24-09-19 14:09:43, Peter Zijlstra wrote:
>
> > > We can push back and say we don't respect the specification because it
> > > is batshit insane ;-)
> >
> > Here is my fingers crossed.
> >
> > [...]
> >
> > > Now granted; there's a number of virtual devices that really don't have
> > > a node affinity, but then, those are not hurt by forcing them onto a
> > > random node, they really don't do anything. Like:
> >
> > Do you really consider a random node a better fix than simply living
> > with a more robust NUMA_NO_NODE which tells the actual state? Page
> > allocator would effectively use the local node in that case. Any code
> > using the cpumask will know that any of the online cpus are usable.
>
> For the pmu devices? Yes, those 'devices' aren't actually used for
> anything other than sysfs entries.
>
> Nothing else uses the struct device.

The below would get rid of the PMU and workqueue warnings with no
side-effects (the device isn't used for anything except sysfs).

I'm stuck in the device code for BDIs; I can't find a sane place to set
the node before the device gets added, because it uses
device_create_vargs().
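
For comparison, the alternative Michal describes above (a NUMA_NO_NODE-aware
lookup that hands back every online CPU) could be sketched in user-space C
roughly as below. This is only a model under stated assumptions, not the
kernel implementation: model_cpumask_t, MODEL_NR_NODES, the mask values and
model_cpumask_of_node() are all hypothetical stand-ins, with one bit per CPU
and eight CPUs split across two nodes.

```c
#include <assert.h>

#define NUMA_NO_NODE	(-1)	/* same value the kernel uses */
#define MODEL_NR_NODES	2	/* assumption: two nodes in this model */

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
typedef unsigned long model_cpumask_t;

/* Assumed topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static const model_cpumask_t model_node_masks[MODEL_NR_NODES] = {
	0x0fUL,
	0xf0UL,
};
static const model_cpumask_t model_online_mask = 0xffUL;

/*
 * NUMA_NO_NODE-aware lookup: "no node" means any online CPU is usable.
 * Any other out-of-range node is treated as a caller bug in this model.
 */
static model_cpumask_t model_cpumask_of_node(int node)
{
	if (node == NUMA_NO_NODE)
		return model_online_mask;

	assert(node >= 0 && node < MODEL_NR_NODES);
	return model_node_masks[node];
}
```

In this model a device left at NUMA_NO_NODE gets the full online mask (0xff),
while forcing it onto node 0, as the diff below does, narrows that to node 0's
CPUs (0x0f). For sysfs-only devices that difference should not matter.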
---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4f08b17d6426..2a64dcc3d70f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9965,6 +9965,7 @@ static int pmu_dev_alloc(struct pmu *pmu)
 	if (!pmu->dev)
 		goto out;
 
+	set_dev_node(pmu->dev, 0);
 	pmu->dev->groups = pmu->attr_groups;
 	device_initialize(pmu->dev);
 	ret = dev_set_name(pmu->dev, "%s", pmu->name);
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index bc2e09a8ea61..efafc4590bbe 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5613,6 +5613,7 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
 	wq_dev->dev.bus = &wq_subsys;
 	wq_dev->dev.release = wq_device_release;
 	dev_set_name(&wq_dev->dev, "%s", wq->name);
+	set_dev_node(&wq_dev->dev, 0);
 
 	/*
 	 * unbound_attrs are created separately. Suppress uevent until