On Mon, Jan 15, 2024 at 05:59:31PM +0800, Huang Shijie wrote:
> After setting the right NUMA node for VMAP stack,
> https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?id=75b5e0bf90bf
> I found there are at least four places in the common code where
> the cpu_to_node() is called before it is initialized:
> 0.) early_trace_init() in kernel/trace/trace.c
> 1.) sched_init() in kernel/sched/core.c
> 2.) init_sched_fair_class() in kernel/sched/fair.c
> 3.) workqueue_init_early() in kernel/workqueue.c
> We cannot use early_cpu_to_node() for them, since early_cpu_to_node()
> does not work for some ARCHs, such as x86, riscv, etc.

I spot that x86 seems to have an implementation of early_cpu_to_node(); what's
wrong with it?

> So we have to implement the arm64 specific cpu_to_node().

Surely those early uses of cpu_to_node() are equally broken on those other
architectures, so why should this be specific to arm64?

> This patch
> 0.) introduces the __cpu_to_node function pointer,
> and exports it for kernel modules.
> 1.) defines a macro cpu_to_node to override the
> generic percpu implementation of cpu_to_node.
> 2.) __cpu_to_node is initialized with early_cpu_to_node() before
> numa_node is initialized.
> 3.) __cpu_to_node is set to arm64_cpu_to_node() when numa_node is ready.
> arm64_cpu_to_node() is a clone of the generic cpu_to_node().

I don't think this is the right approach. Regardless of anything else, we
shouldn't have a solution that only fixes arm64.

Why can't we mandate an early_cpu_to_node(), and have the other architectures
implement that?
Why can't we change cpu_to_node() to automatically do the right thing?
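
To make the last question concrete, here is a rough userspace model of the
idea, not kernel code and not a proposed patch. The arrays, NR_CPUS value,
numa_node_ready flag and setup_percpu_numa_nodes() are hypothetical stand-ins;
only the names cpu_to_node(), early_cpu_to_node() and NUMA_NO_NODE are borrowed
from the kernel. It assumes every architecture can answer early_cpu_to_node(),
which is exactly what the question above about mandating it would require.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS		4	/* hypothetical size, for this model only */
#define NUMA_NO_NODE	(-1)

/* Stand-in for the boot-time table that early_cpu_to_node() would read. */
static int early_node_map[NR_CPUS] = { 0, 0, 1, 1 };

/* Stand-in for the per-CPU numa_node variable; unset until per-CPU setup. */
static int percpu_numa_node[NR_CPUS] = {
	NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};

/* Hypothetical flag: set once the per-CPU values have been populated. */
static bool numa_node_ready;

static int early_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return early_node_map[cpu];
}

/* One generic entry point: fall back to the early table until ready. */
static int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!numa_node_ready)
		return early_cpu_to_node(cpu);
	return percpu_numa_node[cpu];
}

/* Models the point in boot where the per-CPU numa_node becomes valid. */
static void setup_percpu_numa_nodes(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		percpu_numa_node[cpu] = early_cpu_to_node(cpu);
	numa_node_ready = true;
}
```

In this model, early callers such as the four listed above would get the same
answer before and after setup_percpu_numa_nodes(), with no arch-specific
function pointer or exported symbol. Whether the extra branch is acceptable in
the real cpu_to_node() fast path is an open question this sketch does not
answer.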