Re: [PATCH -V2] sched,topology: Update sched topology atomically

From: ying.huang@xxxxxxxxx
Date: Tue Apr 26 2022 - 02:44:07 EST


Hi, Valentin,

On Mon, 2022-04-25 at 16:52 +0100, Valentin Schneider wrote:
> On 21/04/22 08:31, Huang Ying wrote:
> > When Peter Zijlstra reviewed commit 0fb3978b0aac ("sched/numa: Fix
> > NUMA topology for systems with CPU-less nodes") [1], he pointed out
> > that sched_domains_numa_distance and sched_domains_numa_masks are made
> > separate RCU variables. That could go sideways if there were a
> > function using both, although there isn't for now.
> >
> > So we update sched_domains_numa_distance and sched_domains_numa_masks
> > and some other related sched topology parameters atomically to address
> > the potential issues.
> >
> > [1] https://lkml.kernel.org/r/20220214121553.582248-1-ying.huang@xxxxxxxxx
> >
> > Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> > Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Mel Gorman <mgorman@xxxxxxx>
> > Cc: Rik van Riel <riel@xxxxxxxxxxx>
> > Cc: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
> >
> > Changelog:
> >
> > v2:
> >
> > - Addressed comments from Valentin Schneider, Thanks!
>
> One small bug and a whitespace nit below, with those fixed:
>
> Reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx>

Thanks! Your review comments help me a lot!

> FWIW I briefly tested this vs hotplug on QEMU.
>
> > @@ -1806,8 +1873,7 @@ void sched_init_numa(int offline_node)
> >
> >                       if (distance < LOCAL_DISTANCE || distance >= NR_DISTANCE_VALUES) {
> >                               sched_numa_warn("Invalid distance value range");
> > -                             bitmap_free(distance_map);
> > -                             return;
> > +                             goto free_bitmap;
>
> The indentation here is wrong (spaces vs tabs).

Yes. Will fix in the next version.

> >                       }
> >
> >                       bitmap_set(distance_map, distance, 1);
>
> >       /* Compute default topology size */
> >       for (i = 0; sched_domain_topology[i].mask; i++);
>
> After the original boot this will now be the default topology with the NUMA
> bits on top, so we'll just keep growing the array every time we hotplug a
> node. This should use sched_domain_topology_default instead (ditto for the
> copy loop further down).
>

Yes. You are right! Thanks for pointing this out. Will fix this in
the next version.
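
To make the growth problem concrete, here is a minimal user-space sketch,
not kernel code. The names topo_level, default_topology, current_topology,
count_levels() and rebuild() are hypothetical stand-ins for
struct sched_domain_topology_level, sched_domain_topology_default,
sched_domain_topology and the sizing/copy loops in sched_init_numa(). It
only illustrates why the base array should be the default topology.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for struct sched_domain_topology_level.
 * A NULL name terminates the array, like a NULL .mask in the kernel.
 */
struct topo_level {
	const char *name;
};

/* Stand-in for sched_domain_topology_default (never modified). */
static struct topo_level default_topology[] = {
	{ "SMT" }, { "MC" }, { "DIE" }, { NULL },
};

/* Stand-in for sched_domain_topology (replaced on every rebuild). */
static struct topo_level *current_topology = default_topology;

/* Count levels up to the NULL terminator. */
static int count_levels(const struct topo_level *tl)
{
	int i;

	for (i = 0; tl[i].name; i++)
		;
	return i;
}

/*
 * Build a new topology: copy the levels of 'base', then append
 * nr_numa NUMA levels. Returns the new level count, or -1 on
 * allocation failure.
 */
static int rebuild(const struct topo_level *base, int nr_numa)
{
	int n = count_levels(base);
	struct topo_level *tl = calloc(n + nr_numa + 1, sizeof(*tl));
	int i;

	if (!tl)
		return -1;

	for (i = 0; i < n; i++)
		tl[i] = base[i];
	for (i = 0; i < nr_numa; i++)
		tl[n + i].name = "NUMA";

	/* The copy is done, so the old array can be released now. */
	if (current_topology != default_topology)
		free(current_topology);
	current_topology = tl;

	return n + nr_numa;
}
```

With two NUMA levels, calling rebuild(current_topology, 2) repeatedly
gives 5 levels, then 7, then 9, because each call counts the NUMA levels
added by the previous one. Calling rebuild(default_topology, 2) gives 5
levels every time, which is the behavior wanted across node hotplug.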

Best Regards,
Huang, Ying