Re: [PATCH v7] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

From: Peter Zijlstra
Date: Wed Oct 30 2019 - 06:28:58 EST


On Wed, Oct 30, 2019 at 11:22:29AM +0100, Michal Hocko wrote:
> On Wed 30-10-19 11:14:49, Peter Zijlstra wrote:
> > On Wed, Oct 30, 2019 at 05:34:28PM +0800, Yunsheng Lin wrote:
> > > When passing the return value of dev_to_node() to cpumask_of_node()
> > > without checking whether the device's node id is NUMA_NO_NODE, KASAN
> > > detects a global-out-of-bounds access.
> > >
> > > From the discussion [1], NUMA_NO_NODE really means no node affinity,
> > > which also means all cpus should be usable. So cpumask_of_node()
> > > should always return all online cpus when the caller passes
> > > NUMA_NO_NODE as the node id, similar to how the page allocator
> > > handles NUMA_NO_NODE.
> > >
> > > But we cannot really copy the page allocator logic, simply because the
> > > page allocator doesn't enforce the near node affinity. It just picks it
> > > up as a preferred node but then it is free to fall back to any other NUMA
> > > node. This is not the case here: node_to_cpumask_map will only restrict
> > > to the particular node's cpus, which would give really non-deterministic
> > > behavior depending on where the code is executed. So in fact we really
> > > want to return cpu_online_mask for NUMA_NO_NODE.
> > >
> > > Also there is a debugging version of node_to_cpumask_map() for x86 and
> > > arm64, which is only used when CONFIG_DEBUG_PER_CPU_MAPS is defined. This
> > > patch changes it to handle NUMA_NO_NODE the same way as the normal
> > > node_to_cpumask_map().
> > >
> > > [1] https://lkml.org/lkml/2019/9/11/66
> > > Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> > > Suggested-by: Michal Hocko <mhocko@xxxxxxxxxx>
> > > Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> > > Acked-by: Paul Burton <paul.burton@xxxxxxxx> # MIPS bits
> >
> > Still:
> >
> > Nacked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>
> Do you have any other proposal that doesn't make any wild guesses about
> which node to use instead of the undefined one?

It only makes 'wild' guesses when the BIOS is shit and it complains
about that.

Or do you like your BIOS broken?