Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

From: Rik van Riel
Date: Mon Jul 30 2018 - 10:30:31 EST


On Mon, 2018-07-30 at 11:55 +0200, Peter Zijlstra wrote:
> On Sun, Jul 29, 2018 at 03:54:52PM -0400, Rik van Riel wrote:
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index c45de46fdf10..11724c9e88b0 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2691,7 +2691,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> >  	 */
> >  	if (mm) {
> >  		membarrier_mm_sync_core_before_usermode(mm);
> > -		mmdrop(mm);
> > +		drop_lazy_mm(mm);
> >  	}
> >  	if (unlikely(prev_state == TASK_DEAD)) {
> >  		if (prev->sched_class->task_dead)
> > @@ -2805,7 +2805,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
> >  	 */
> >  	if (!mm) {
> >  		next->active_mm = oldmm;
> > -		mmgrab(oldmm);
> > +		grab_lazy_mm(oldmm);
> >  		enter_lazy_tlb(oldmm, next);
> >  	} else
> >  		switch_mm_irqs_off(oldmm, mm, next);
>
> What happened to the rework I did there? That not only avoided fiddling
> with active_mm, but also avoids grab/drop cycles for the other
> architectures when doing task->kthread->kthread->task things.

I don't think I saw that. I only saw your email from
July 20th with this fragment of code, which does not
appear to avoid the grab/drop cycles, and still fiddles
with active_mm:

Date: Fri, 20 Jul 2018 11:32:39 +0200
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Subject: Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier
Message-ID: <20180720093239.GO2494@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>

+	/*
+	 * kernel -> kernel   lazy + transfer active
+	 *   user -> kernel   lazy + mmgrab() active
+	 *
+	 * kernel ->   user   switch + mmdrop() active
+	 *   user ->   user   switch
+	 */
+	if (!next->mm) {				// to kernel
+		enter_lazy_tlb(prev->active_mm, next);
+
+#ifdef ARCH_NO_ACTIVE_MM
+		next->active_mm = prev->active_mm;
+		if (prev->mm)				// from user
+			mmgrab(prev->active_mm);
+#endif
+	} else {					// to user
+		switch_mm_irqs_off(prev->active_mm, next->mm, next);
+
+#ifdef ARCH_NO_ACTIVE_MM
+		if (!prev->mm) {			// from kernel
+			/* will mmdrop() in finish_task_switch(). */
+			rq->prev_mm = prev->active_mm;
+			prev->active_mm = NULL;
+		}
+#endif
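
To make the transition table in that comment concrete, here is a
user-space toy model that counts mmgrab()/mmdrop() calls over a
task->kthread->kthread->task sequence. It is only a sketch: the struct
layouts, the counters, rq_prev_mm and
simulate_task_kthread_kthread_task() are made up for illustration, the
#ifdef ARCH_NO_ACTIVE_MM blocks are assumed to be compiled in, and
enter_lazy_tlb()/switch_mm_irqs_off() are left out because they do not
touch the refcount here. It says nothing about what the real scheduler
does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's definitions. */
struct mm { int refs; };
struct task { struct mm *mm; struct mm *active_mm; };
struct counts { int grabs; int drops; int final_refs; };

static struct mm *rq_prev_mm;	/* hypothetical model of rq->prev_mm */
static struct counts stats;

static void mmgrab(struct mm *mm)
{
	mm->refs++;
	stats.grabs++;
}

static void mmdrop(struct mm *mm)
{
	assert(mm->refs > 0);
	mm->refs--;
	stats.drops++;
}

/* Mirrors only the refcount side of the quoted fragment. */
static void context_switch(struct task *prev, struct task *next)
{
	if (!next->mm) {			/* to kernel */
		next->active_mm = prev->active_mm;
		if (prev->mm)			/* from user */
			mmgrab(prev->active_mm);
	} else if (!prev->mm) {			/* kernel -> user */
		rq_prev_mm = prev->active_mm;
		prev->active_mm = NULL;
	}
}

/* Drops the reference handed over through rq_prev_mm, if any. */
static void finish_task_switch(void)
{
	if (rq_prev_mm) {
		mmdrop(rq_prev_mm);
		rq_prev_mm = NULL;
	}
}

/* Runs user A -> kthread -> kthread -> user B and reports the counts. */
struct counts simulate_task_kthread_kthread_task(void)
{
	struct mm mm_a = { .refs = 1 }, mm_b = { .refs = 1 };
	struct task a = { &mm_a, &mm_a }, b = { &mm_b, &mm_b };
	struct task k1 = { NULL, NULL }, k2 = { NULL, NULL };
	struct task *seq[] = { &a, &k1, &k2, &b };

	stats = (struct counts){ 0, 0, 0 };
	rq_prev_mm = NULL;
	for (size_t i = 0; i + 1 < sizeof(seq) / sizeof(seq[0]); i++) {
		context_switch(seq[i], seq[i + 1]);
		finish_task_switch();
	}
	stats.final_refs = mm_a.refs;
	return stats;
}
```

In this model the whole sequence costs one mmgrab() and one mmdrop(),
and the first task's mm ends with the same reference count it started
with. A variant that grabs on every switch to a kernel thread and drops
on every switch away from one would pay two of each for the same
sequence.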

What email should I look for to find the thing you
referenced above?

> I agree with Andy that if you avoid the refcount fiddling, then you
> should also not muck with active_mm.
>
> That is, if you keep active_mm for now (which seems a reasonable first
> step) then at least ensure you keep ->mm == ->active_mm at all times.

There do not seem to be a lot of places left in
arch/x86/ that reference active_mm. I guess the
next patch series should excise those? :)

--
All Rights Reversed.
