Re: Out-of-bounds access when hartid >= NR_CPUS
From: Heiko Stübner
Date: Tue Oct 26 2021 - 04:53:52 EST
Am Dienstag, 26. Oktober 2021, 08:44:31 CEST schrieb Geert Uytterhoeven:
> On Tue, Oct 26, 2021 at 2:37 AM Ron Economos <re@xxxxxxxx> wrote:
> > On 10/25/21 8:54 AM, Geert Uytterhoeven wrote:
> > > When booting a kernel with CONFIG_NR_CPUS=4 on Microchip PolarFire,
> > > the 4th CPU either fails to come online, or the system crashes.
> > >
> > > This happens because PolarFire has 5 CPU cores: hart 0 is an e51,
> > > and harts 1-4 are u54s, with the latter becoming CPUs 0-3 in Linux:
> > > - unused core has hartid 0 (sifive,e51),
> > > - processor 0 has hartid 1 (sifive,u74-mc),
> > > - processor 1 has hartid 2 (sifive,u74-mc),
> > > - processor 2 has hartid 3 (sifive,u74-mc),
> > > - processor 3 has hartid 4 (sifive,u74-mc).
> > >
> > > I assume the same issue is present on the SiFive fu540 and fu740
> > > SoCs, but I don't have access to these. The issue is not present
> > > on StarFive JH7100, as processor 0 has hartid 1, and processor 1 has
> > > hartid 0.
> > >
> > > arch/riscv/kernel/cpu_ops.c has:
> > >
> > > void *__cpu_up_stack_pointer[NR_CPUS] __section(".data");
> > > void *__cpu_up_task_pointer[NR_CPUS] __section(".data");
> > >
> > > void cpu_update_secondary_bootdata(unsigned int cpuid,
> > > struct task_struct *tidle)
> > > {
> > > int hartid = cpuid_to_hartid_map(cpuid);
> > >
> > > /* Make sure tidle is updated */
> > > smp_mb();
> > > WRITE_ONCE(__cpu_up_stack_pointer[hartid],
> > > task_stack_page(tidle) + THREAD_SIZE);
> > > WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
> > >
> > > The above two writes cause out-of-bound accesses beyond
> > > __cpu_up_{stack,pointer}_pointer[] if hartid >= CONFIG_NR_CPUS.
> > >
> > > }
> > >
> > > arch/riscv/kernel/smpboot.c:setup_smp(void) detects CPUs like this:
> > >
> > > for_each_of_cpu_node(dn) {
> > > hart = riscv_of_processor_hartid(dn);
> > > if (hart < 0)
> > > continue;
> > >
> > > if (hart == cpuid_to_hartid_map(0)) {
> > > BUG_ON(found_boot_cpu);
> > > found_boot_cpu = 1;
> > > early_map_cpu_to_node(0, of_node_to_nid(dn));
> > > continue;
> > > }
> > > if (cpuid >= NR_CPUS) {
> > > pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
> > > cpuid, hart);
> > > break;
> > > }
> > >
> > > cpuid_to_hartid_map(cpuid) = hart;
> > > early_map_cpu_to_node(cpuid, of_node_to_nid(dn));
> > > cpuid++;
> > > }
> > >
> > > So cpuid >= CONFIG_NR_CPUS (too many CPU cores) is already rejected.
> > >
> > > How to fix this?
> > >
> > > We could skip hartids >= NR_CPUS, but that feels strange to me, as
> > > you need NR_CPUS to be larger (much larger if the first usable hartid
> > > is a large number) than the number of CPUs used.
> > The Ubuntu distro config for HiFive Unmatched set this to CONFIG_NR_CPUS=8.
>
> I know. Same for most defconfigs in Linux. But we do not tend to
> work around buffer overflows by changing config values. Besides,
> those configs will still experience the issue when run on e.g. an
> 8+1 core processor where the cores used by Linux have hartids 1-8.
>
> I noticed because I started with a starlight config with
> CONFIG_NR_CPUS=2 (which gave me only one core), changed that to
> CONFIG_NR_CPUS=4, and got a kernel that didn't boot at all (no output
> without earlycon).I know. Same for most defconfigs in Linux. But we
> do not tend to
> work around buffer overflows by changing config values. Besides,
> those configs will still experience the issue when run on e.g. an
> 8+1 core processor where the cores used by Linux have hartids 1-8.
>
> > > We could store the minimum hartid, and always subtract that when
> > > accessing __cpu_up_{stack,pointer}_pointer[] (also in
> > > arch/riscv/kernel/head.S), but that means unused cores cannot be in the
> > > middle of the hartid range.
> > >
> > > Are hartids guaranteed to be continuous? If not, we have no choice but
> > > to index __cpu_up_{stack,pointer}_pointer[] by cpuid instead, which
> > > needs a more expensive conversion in arch/riscv/kernel/head.S.
>
> https://riscv.org/wp-content/uploads/2017/05/riscv-privileged-v1.10.pdf
> says:
>
> Hart IDs might not necessarily be numbered contiguously in a
> multiprocessor system, but at least one hart must have a hart
> ID of zero.
>
> Which means indexing arrays by hart ID is a no-go?
Isn't that also similar on aarch64?
On a rk3399 you get 0-3 and 100-101 and with the paragraph above
something like this could very well exist on some riscv cpu too I guess.
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/linux-riscv
>