Re: [PATCH v2] cpu-topology: Skip the exist but not possible cpu nodes

From: Sudeep Holla
Date: Mon Jan 13 2020 - 05:19:27 EST


On Sat, Jan 11, 2020 at 02:53:40PM +0800, Zeng Tao wrote:
> When CONFIG_NR_CPUS is smaller than the cpu nodes defined in the device
> tree, all the cpu nodes parsing will fail.
> And this is not reasonable for a legal device tree configs.
> In this patch, skip such cpu nodes rather than return an error.
> With CONFIG_NR_CPUS = 128 and cpus nodes num in device tree is 130,
> The following warning messages will be print during boot:
> CPU node for /cpus/cpu@128 exist but the possible cpu range is :0-127
> CPU node for /cpus/cpu@129 exist but the possible cpu range is :0-127
> CPU node for /cpus/cpu@130 exist but the possible cpu range is :0-127
>
> Signed-off-by: Zeng Tao <prime.zeng@xxxxxxxxxxxxx>
> ---
> Changelog:
> v1->v2:
> -Remove redundant -ENODEV assignment in get_cpu_for_node
> -Add comment to describe the get_cpu_for_node return values
> -Add skip process for cpu threads
> -Update the commit log with more detail
> ---
> drivers/base/arch_topology.c | 37 +++++++++++++++++++++++++++++--------
> 1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 5fe44b3..01f0e21 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -248,22 +248,44 @@ core_initcall(free_raw_capacity);
> #endif
>
> #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> +/*
> + * This function returns the logic cpu number of the node.
> + * There are totally three kinds of return values:
> + * (1) logic cpu number which is > 0.
> + * (2) -ENDEV when the node is valid one which can be found in the device tree
> + * but there is no possible cpu nodes to match, when the CONFIG_NR_CPUS is
> + * smaller than cpus node numbers in device tree, this will happen. It's
> + * suggested to just ignore this case.

s/ENDEV/ENODEV/

Also as I mentioned earlier, I prefer not to add any extra logic here
other than the above comment to make it explicit. This triggers unnecessary
warnings when someone boots with limited CPUs for valid reasons.


> + * (3) -EINVAL when other errors occur.
> + */
> static int __init get_cpu_for_node(struct device_node *node)
> {
> - struct device_node *cpu_node;
> + struct device_node *cpu_node, *t;
> int cpu;
> + bool found = false;
>
> cpu_node = of_parse_phandle(node, "cpu", 0);
> if (!cpu_node)
> - return -1;
> + return -EINVAL;
> +
> + for_each_of_cpu_node(t)
> + if (t == cpu_node) {
> + found = true;
> + break;
> + }
> +
> + if (!found) {
> + pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
> + return -EINVAL;
> + }
>
> cpu = of_cpu_node_to_id(cpu_node);
> if (cpu >= 0)
> topology_parse_cpu_capacity(cpu_node, cpu);
> else
> - pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
> + pr_warn("CPU node for %pOF exist but the possible cpu range is :%*pbl\n",
> + cpu_node, cpumask_pr_args(cpu_possible_mask));
>
> - of_node_put(cpu_node);

Why is this dropped ?

--
Regards,
Sudeep