Re: [GIT PULL] Re: REGRESSION: Performance regressions from switching anon_vma->lock to mutex

From: Ingo Molnar
Date: Thu Jun 16 2011 - 19:03:58 EST



* Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:

> On Thu, Jun 16, 2011 at 10:25:50PM +0200, Ingo Molnar wrote:
> >
> > * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > > The funny thing about this workload is that context-switches are
> > > > really a fastpath here and we are using anonymous IRQ-triggered
> > > > softirqs embedded in random task contexts as a workaround for
> > > > that.
> > >
> > > The other thing that the IRQ-triggered softirqs do is to get the
> > > callbacks invoked in cases where a CPU-bound user thread is never
> > > context switching.
> >
> > Yeah - but this workload didn't have that.
> >
> > > Of course, one alternative might be to set_need_resched() to force
> > > entry into the scheduler as needed.
> >
> > No need for that: we can just do the callback not in softirq but in
> > regular syscall context in that case, in the return-to-userspace
> > notifier. (see TIF_USER_RETURN_NOTIFY and the USER_RETURN_NOTIFIER
> > facility)
> >
> > Abusing a facility like setting need_resched artificially will
> > generally cause trouble.
>
> If the task enqueued callbacks in the kernel, and thus started a new
> grace period, it might return to userspace before every CPU has
> completed that grace period, and you need that full completion to
> happen before invoking the callbacks.
>
> I think you need to keep the tick in such a case, because you can't
> count on the other CPUs to handle that completion, as they may all
> be idle.
>
> So when you resume to userspace after having started a GP, either
> you find another CPU to handle the GP completion and callback
> execution, or you keep the tick until you are done.

We'll have a scheduler tick in any case, which will act as a
worst-case RCU tick.
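
In rough terms that is because the periodic tick already pokes RCU on
every non-idle CPU - the sketch below paraphrases the relevant calls
in update_process_times(); it's not the exact code:

#include <linux/kernel_stat.h>		/* account_process_tick() */
#include <linux/rcupdate.h>		/* rcu_check_callbacks() */
#include <linux/sched.h>		/* scheduler_tick(), current */
#include <linux/smp.h>			/* smp_processor_id() */

/*
 * Paraphrased sketch of the timer-interrupt path: every tick on a
 * non-idle CPU lets RCU note quiescent states and kick off callback
 * processing, so even a CPU-bound task that never schedules is
 * covered eventually.
 */
static void tick_sketch(int user_tick)
{
        int cpu = smp_processor_id();

        account_process_tick(current, user_tick);
        rcu_check_callbacks(cpu, user_tick);
        scheduler_tick();
}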

My main point is that we need to check whether this solution improves
performance over the current softirq code. I think there's a real
chance that it improves things like VFS workloads, because it
provides (much!) lower grace-period latencies and hence fundamentally
better cache locality.

If a workload pays the cost of frequent scheduling then it might as
well use a beneficial side-effect of that scheduling: high-freq grace
periods ...

If it improves performance we can figure out all the loose ends. If
it doesn't, then the loose ends are not worth worrying about.
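
Just to make the USER_RETURN_NOTIFIER idea above a bit more concrete,
a minimal sketch is below. The rcu_do_pending_callbacks() helper is
hypothetical - a stand-in for whatever the RCU core would need to
export - and a real patch would also have to guard against double
registration and CPU migration:

#include <linux/percpu.h>
#include <linux/user-return-notifier.h>

/* Hypothetical hook that RCU would export to run pending callbacks. */
extern void rcu_do_pending_callbacks(void);

static DEFINE_PER_CPU(struct user_return_notifier, rcu_urn);

/* Runs in task context, right before the return to userspace. */
static void rcu_on_user_return(struct user_return_notifier *urn)
{
        /* One-shot: we re-arm when new callbacks get queued. */
        user_return_notifier_unregister(urn);
        rcu_do_pending_callbacks();             /* hypothetical helper */
}

/* Called (e.g. from call_rcu()) when this CPU has callbacks pending. */
static void rcu_arm_user_return(void)
{
        struct user_return_notifier *urn = &__get_cpu_var(rcu_urn);

        urn->on_user_return = rcu_on_user_return;
        user_return_notifier_register(urn);     /* sets TIF_USER_RETURN_NOTIFY */
}

That way the callback processing runs in the same task context that
already paid the cost of the scheduling.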

Thanks,

Ingo