Re: [patch V4 4/8] sched: Make migrate_disable/enable() independent of RT

From: Mel Gorman
Date: Thu Nov 19 2020 - 07:14:40 EST


On Thu, Nov 19, 2020 at 12:14:11PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 19, 2020 at 09:38:34AM +0000, Mel Gorman wrote:
> > On Wed, Nov 18, 2020 at 08:48:42PM +0100, Thomas Gleixner wrote:
> > > From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > >
> > > Now that the scheduler can deal with migrate disable properly, there is no
> > > real compelling reason to make it only available for RT.
> > >
> > > There are quite some code paths which needlessly disable preemption in
> > > order to prevent migration and some constructs like kmap_atomic() enforce
> > > it implicitly.
> > >
> > > Making it available independent of RT allows to provide a preemptible
> > > variant of kmap_atomic() and makes the code more consistent in general.
> > >
> > > FIXME: Rework the comment in preempt.h - Peter?
> > >
> >
> > I didn't keep up to date and there is clearly a dependency on patches in
> > tip for migrate_enable/migrate_disable. It's not 100% clear to me what
> > reworking you're asking for but then again, I'm not Peter!
>
> He's talking about the big one: "Migrate-Disable and why it is
> undesired.".
>

Ah yes, that makes more sense. I was thinking in terms of what is protected
but the PREEMPT_RT hazard is severe.

> I still hate all of this, and I really fear that with migrate_disable()
> available, people will be lazy and usage will increase :/
>
> Case at hand is this series, the only reason we need it here is because
> per-cpu page-tables are expensive...
>

I guessed as much; it was the only thing that made sense.

> I really do think we want to limit the usage and get rid of the implicit
> migrate_disable() in spinlock_t/rwlock_t for example.
>
> AFAICT the scenario described there is entirely possible; and it has to
> show up for workloads that rely on multi-cpu bandwidth for correctness.
>
> Switching from preempt_disable() to migrate_disable() hides the
> immediate / easily visible high priority latency, but you move the
> interference term into a place where it is much harder to detect, you
> don't lose the term, it stays in the system.
>
> So no, I don't want to make the comment less scary. Usage is
> discouraged.

Then make it more scary by adding this to the kerneldoc section for
migrate_disable()?

* Usage of migrate_disable is heavily discouraged as it is extremely
* hazardous on PREEMPT_RT kernels and any usage needs to be heavily
* justified. Before even thinking about using this, read
* "Migrate-Disable and why it is undesired" in
* include/linux/preempt.h and include both a comment and document
* in the changelog why the use case is an exception.
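
As a rough illustration only, the text above could sit in a kerneldoc block
along these lines, with a caller carrying the justification comment it asks
for. This is a userspace sketch, not kernel code: the stub bodies, the
migrate_disable_depth counter, the one-line kerneldoc summary and the
copy_via_cpu_local_slot() caller are all hypothetical stand-ins.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's per-task nesting state. */
static int migrate_disable_depth;

/**
 * migrate_disable - Prevent the current task from migrating to another CPU
 *
 * Usage of migrate_disable is heavily discouraged as it is extremely
 * hazardous on PREEMPT_RT kernels and any usage needs to be heavily
 * justified. Before even thinking about using this, read
 * "Migrate-Disable and why it is undesired" in
 * include/linux/preempt.h and include both a comment and document
 * in the changelog why the use case is an exception.
 */
static void migrate_disable(void)
{
	/* Stub: the real function does far more than count nesting. */
	migrate_disable_depth++;
}

/* Stub counterpart; the assert catches unbalanced calls in this model. */
static void migrate_enable(void)
{
	assert(migrate_disable_depth > 0);
	migrate_disable_depth--;
}

/* Hypothetical caller showing the call-site justification comment. */
static void copy_via_cpu_local_slot(char *dst, const char *src, size_t len)
{
	/*
	 * migrate_disable() instead of preempt_disable(): the (imagined)
	 * temporary mapping is only valid on this CPU and the copy should
	 * stay preemptible. See "Migrate-Disable and why it is undesired"
	 * in include/linux/preempt.h; the changelog explains why this
	 * user is an exception.
	 */
	migrate_disable();
	memcpy(dst, src, len);
	migrate_enable();
}
```

The point of the sketch is the placement: the warning lives with the
interface, and each user repeats its justification at the call site where a
reviewer will see it.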

It's not necessary for the current series because the interface hides
it and anyone poking at the internals of kmap_atomic probably should be
aware of the address space and TLB hazards associated with it. There are
few in-tree users and presumably any future preempt-rt related merges
already know why migrate_disable is required.

However, with the kerneldoc in place, new users who are not PREEMPT_RT-aware
have no excuse for missing it. It also makes it easier to NAK or revert a
patch that lacks proper justification, similar to how undocumented usages of
memory barriers tend to get a poor reception.

--
Mel Gorman
SUSE Labs