Re: [RFC PATCH 03/13] sched: Core-wide rq->lock

From: Peter Zijlstra
Date: Wed Apr 15 2020 - 06:55:58 EST


On Tue, Apr 14, 2020 at 05:35:07PM -0400, Vineeth Remanan Pillai wrote:
> > Aside from the fact that it's probably much saner to write this as:
> >
> > rq->core_enabled = static_key_enabled(&__sched_core_enabled);
> >
> > I'm fairly sure I didn't write this part. And while I do somewhat see
> > the point of disabling core scheduling for a core that has only a single
> > thread on, I wonder why we care.
> >
> I think this change was to fix some crashes which happened due to
> uninitialized rq->core if a sibling was offline during boot and is
> onlined after coresched was enabled.
>
> https://lwn.net/ml/linux-kernel/20190424111913.1386-1-vpillai@xxxxxxxxxxxxxxxx/
>
> I tried to fix it by initializing coresched members during a cpu online
> and tearing it down on a cpu offline. This was back in v3 and I do not
> remember the exact details. I shall revisit this and see if there is a
> better way to fix the race condition above.

Argh, that problem again. So AFAIK booting with maxcpus= is broken in a
whole number of 'interesting' ways. I'm not sure what to do about that,
perhaps we should add a config around that option and make it depend on
CONFIG_BROKEN.

That said; I'm thinking it shouldn't be too hard to fix up the core
state before we add the CPU to the masks, but it will be arch specific.
See speculative_store_bypass_ht_init() for inspiration, but you'll need
to be even earlier, before set_cpu_sibling_map() in smp_callin() on x86
(no clue about other archs).
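
To make the ordering concrete, here is a rough userspace sketch, not
kernel code. Every name in it (sketch_rq, sketch_core_init(),
sketch_publish_sibling(), sketch_bringup()) is made up for
illustration; the only point is that the shared core pointer is set
before the CPU shows up in anybody's sibling map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8	/* hypothetical, small on purpose */

/* Stand-in for struct rq; only the shared core pointer is modelled. */
struct sketch_rq {
	struct sketch_rq *core;	/* NULL until core state is set up */
};

static struct sketch_rq sketch_rqs[SKETCH_NR_CPUS];
/* Stand-in for the per-CPU sibling masks, one bit per CPU. */
static uint8_t sketch_sibling_map[SKETCH_NR_CPUS];

/* Step 1: set up core-wide state for the incoming CPU. */
static void sketch_core_init(int cpu, int leader)
{
	sketch_rqs[cpu].core = &sketch_rqs[leader];
}

/* Step 2: publish the CPU in the sibling map of every sibling. */
static void sketch_publish_sibling(int cpu, int leader)
{
	uint8_t mask;
	int i;

	/* Publishing a CPU without core state is the bug being avoided. */
	assert(sketch_rqs[cpu].core != NULL);

	mask = sketch_sibling_map[leader] | (uint8_t)(1u << cpu);
	for (i = 0; i < SKETCH_NR_CPUS; i++) {
		if (mask & (1u << i))
			sketch_sibling_map[i] = mask;
	}
}

/* Bring-up path: state first, visibility second. */
static void sketch_bringup(int cpu, int leader)
{
	sketch_core_init(cpu, leader);
	sketch_publish_sibling(cpu, leader);
}
```

In the real thing, step 2 would correspond to set_cpu_sibling_map() on
x86, and step 1 is the arch-specific hook that would have to be added
in front of it.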

Even without maxcpus= this can happen when you do physical hotplug and
add a part (or replace one where the new part has more cores than the
old).

The moment core-scheduling is enabled and you're adding unknown
topology, we need to set up state before we publish the mask... or, I
suppose, endlessly do 'smt_mask & active_mask' all over the place :/ In
which case you can indeed do it purely in sched/core.
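
For illustration only, the 'smt_mask & active_mask' variant could look
something like the userspace sketch below. The names
(sketch_cpumask_t, sketch_usable_siblings(), sketch_core_needs_sched())
are hypothetical and a plain 64-bit word stands in for a cpumask; in
the kernel this would presumably be a cpumask_and() of cpu_smt_mask(cpu)
and cpu_active_mask at each use site.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a toy stand-in for struct cpumask (max 64 CPUs). */
typedef uint64_t sketch_cpumask_t;

/* Siblings of a core that are actually active right now. */
static inline sketch_cpumask_t
sketch_usable_siblings(sketch_cpumask_t smt_mask, sketch_cpumask_t active_mask)
{
	return smt_mask & active_mask;
}

/* Count set bits by clearing the lowest set bit until none remain. */
static inline int sketch_weight(sketch_cpumask_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;
		n++;
	}
	return n;
}

/*
 * Core scheduling only matters when more than one sibling is active;
 * a sibling that is in smt_mask but not yet active is ignored.
 */
static inline int
sketch_core_needs_sched(sketch_cpumask_t smt_mask, sketch_cpumask_t active_mask)
{
	return sketch_weight(sketch_usable_siblings(smt_mask, active_mask)) > 1;
}
```

The cost of this approach is visible even here: every consumer of the
sibling mask has to remember to apply the filter.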

Hurmph...