Re: [PATCH] driver core: ensure a device has valid node id in device_add()

From: Michal Hocko
Date: Tue Sep 10 2019 - 08:53:45 EST


On Tue 10-09-19 20:47:40, Yunsheng Lin wrote:
> On 2019/9/10 19:12, Greg KH wrote:
> > On Tue, Sep 10, 2019 at 01:04:51PM +0200, Michal Hocko wrote:
> >> On Tue 10-09-19 18:58:05, Yunsheng Lin wrote:
> >>> On 2019/9/10 17:31, Greg KH wrote:
> >>>> On Tue, Sep 10, 2019 at 02:43:32PM +0800, Yunsheng Lin wrote:
> >>>>> On 2019/9/9 17:53, Greg KH wrote:
> >>>>>> On Mon, Sep 09, 2019 at 02:04:23PM +0800, Yunsheng Lin wrote:
> >>>>>>> Currently a device does not belong to any of the numa nodes
> >>>>>>> (dev->numa_node is NUMA_NO_NODE) when the node id is neither
> >>>>>>> specified by fw nor by virtual device layer and the device has
> >>>>>>> no parent device.
> >>>>>>
> >>>>>> Is this really a problem?
> >>>>>
> >>>>> Not really.
> >>>>> Someone need to guess the node id when it is not specified, right?
> >>>>
> >>>> No, why? Guessing guarantees you will get it wrong on some systems.
> >>>>
> >>>> Are you seeing real problems because the id is not being set? What
> >>>> problem is this fixing that you can actually observe?
> >>>
> >>> When passing the return value of dev_to_node() to cpumask_of_node()
> >>> without checking the node id if the node id is not valid, there is
> >>> global-out-of-bounds detected by KASAN as below:
> >>
> >> OK, I seem to remember this being brought up already. And now when I
> >> think about it, we really want to make cpumask_of_node NUMA_NO_NODE
> >> aware. That means using the same trick the allocator does for this
> >> special case.
> >
> > That seems reasonable to me, and much more "obvious" as to what is going
> > on.
> >
>
> Ok, thanks for the suggestion.
>
> For arm64 and x86, there are two versions of cpumask_of_node().
>
> when CONFIG_DEBUG_PER_CPU_MAPS is defined, the cpumask_of_node()
> in arch/x86/mm/numa.c is used, which does partial node id checking:
>
> const struct cpumask *cpumask_of_node(int node)
> {
> if (node >= nr_node_ids) {
> printk(KERN_WARNING
> "cpumask_of_node(%d): node > nr_node_ids(%u)\n",
> node, nr_node_ids);
> dump_stack();
> return cpu_none_mask;
> }
> if (node_to_cpumask_map[node] == NULL) {
> printk(KERN_WARNING
> "cpumask_of_node(%d): no node_to_cpumask_map!\n",
> node);
> dump_stack();
> return cpu_online_mask;
> }
> return node_to_cpumask_map[node];
> }
>
> when CONFIG_DEBUG_PER_CPU_MAPS is undefined, the cpumask_of_node()
> in arch/x86/include/asm/topology.h is used:
>
> static inline const struct cpumask *cpumask_of_node(int node)
> {
> return node_to_cpumask_map[node];
> }

I would simply go with. There shouldn't be any need for heavy weight
checks that CONFIG_DEBUG_PER_CPU_MAPS has.

static inline const struct cpumask *cpumask_of_node(int node)
{
/* A nice comment goes here */
if (node == NUMA_NO_NODE)
return node_to_cpumask_map[numa_mem_id()];
return node_to_cpumask_map[node];
}
--
Michal Hocko
SUSE Labs