RE: [PATCH 1/1] sched/topology: Make sched_init_numa() use a set for the deduplicating sort

From: Song Bao Hua (Barry Song)
Date: Mon Feb 01 2021 - 05:36:50 EST

> -----Original Message-----
> From: Dietmar Eggemann [mailto:dietmar.eggemann@xxxxxxx]
> Sent: Monday, February 1, 2021 10:54 PM
> To: Valentin Schneider <valentin.schneider@xxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx
> Cc: mingo@xxxxxxxxxx; peterz@xxxxxxxxxxxxx; vincent.guittot@xxxxxxxxxx;
> morten.rasmussen@xxxxxxx; mgorman@xxxxxxx; Song Bao Hua (Barry Song)
> <song.bao.hua@xxxxxxxxxxxxx>
> Subject: Re: [PATCH 1/1] sched/topology: Make sched_init_numa() use a set for
> the deduplicating sort
>
> On 22/01/2021 13:39, Valentin Schneider wrote:
>
> [...]
>
> > @@ -1705,7 +1702,7 @@ void sched_init_numa(void)
> >  	/* Compute default topology size */
> >  	for (i = 0; sched_domain_topology[i].mask; i++);
> >
> > -	tl = kzalloc((i + level + 1) *
> > +	tl = kzalloc((i + nr_levels) *
> >  			sizeof(struct sched_domain_topology_level), GFP_KERNEL);
> >  	if (!tl)
> >  		return;
>
> This hunk creates issues during startup on my Arm64 juno board on tip/sched/core.

I also reported this kernel panic here:
https://lore.kernel.org/lkml/bfb703294b234e1e926a68fcb73dbee3@xxxxxxxxxxxxx/#t

>
> ---8<---
>
> From: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> Date: Mon, 1 Feb 2021 09:58:04 +0100
> Subject: [PATCH] sched/topology: Fix sched_domain_topology_level alloc in
> sched_init_numa
>
> Commit "sched/topology: Make sched_init_numa() use a set for the
> deduplicating sort" allocates 'i + nr_levels (level)' instead of
> 'i + nr_levels + 1' sched_domain_topology_level.
>
> This led to an Oops (on Arm64 juno with CONFIG_SCHED_DEBUG):
>
>   sched_init_domains
>     build_sched_domains()
>       __free_domain_allocs()
>         __sdt_free() {
>           ...
>           for_each_sd_topology(tl)
>             ...
>               sd = *per_cpu_ptr(sdd->sd, j); <--
>             ...
>         }
>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> ---
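
For anyone following along, here is a minimal user-space sketch of why the
extra slot matters. It is not the kernel code: `struct level`,
`count_levels()` and `alloc_levels()` are hypothetical stand-ins, and it
assumes only what the trace above suggests, namely that the topology array
is walked until an entry with a NULL mask is found. If the allocation has no
room for that zeroed terminator, the walk runs past the end of the buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sched_domain_topology_level. */
struct level {
	const char *mask;	/* NULL marks the end of the array */
};

/*
 * Walk the array until the NULL-mask sentinel, in the same spirit as
 * for_each_sd_topology(). This is only safe if a terminator exists.
 */
static int count_levels(const struct level *tl)
{
	int n = 0;

	for (; tl->mask; tl++)
		n++;
	return n;
}

/*
 * Allocate room for the default levels, the NUMA levels and ONE extra
 * zeroed entry that acts as the terminator. calloc() zeroes the memory,
 * much like kzalloc() does in the kernel.
 */
static struct level *alloc_levels(int nr_default, int nr_numa)
{
	return calloc((size_t)nr_default + nr_numa + 1, sizeof(struct level));
}
```

With `nr_default + nr_numa` entries only, filling every slot leaves nothing
to stop the loop in `count_levels()`, which is the user-space analogue of the
out-of-bounds walk in `__sdt_free()`.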

This patch also resolved my panic. So:

Tested-by: Barry Song <song.bao.hua@xxxxxxxxxxxxx>

Thanks
Barry