Re: [RFC] [PATCH] Pre-emption control for userspace

From: Thomas Gleixner
Date: Thu Mar 06 2014 - 06:15:27 EST


On Wed, 5 Mar 2014, Khalid Aziz wrote:
> On 03/05/2014 04:10 AM, Peter Zijlstra wrote:
> > On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote:
> > > Anything else?
> >
> > Proxy execution; it's a form of PI that works for arbitrary scheduling
> > policies (thus also very much including fair).
> >
> > With that what you effectively end up with is the lock holder running
> > 'boosted' by the runtime of its blocked chain until the entire chain
> > runs out of time, at which point preemption doesn't matter anyhow.
> >
>
> Hello Peter,
>
> I read through the concept of proxy execution and it is a very interesting
> concept. I come from many years of realtime and embedded systems development
> and I can easily recall various problems in the past that can be solved or
> helped by this. Looking at the current problem I am trying to solve with
> databases and JVM, I run into the same issue I described in my earlier email.
> Proxy execution is a post-contention solution. By the time proxy execution can
> do something for my case, I have already paid the price of contention and a
> context switch which is what I am trying to avoid. For a critical section that
> is very short compared to the size of execution thread, which is the case I am
> looking at, avoiding preemption in the middle of that short critical section
> helps much more than dealing with lock contention later on. The goal here is
> to avoid lock contention and associated cost. I do understand the cost of
> dealing with lock contention poorly and that can easily be much bigger cost,
> but I am looking into avoiding even getting there.

We understand that you want to avoid preemption in the first place and
not get into the contention handling case.

But what you're trying to do is essentially create an ABI which we
have to support and maintain forever. And that definitely warrants a
few serious questions.

Let's ignore the mm related issues for now as those can be solved. That's
the least of my worries.

Right now you are using this for a single use case with a well defined
environment, where all related threads reside in the same scheduling
class (FAIR). But that's one of a gazillion use cases of Linux.

If we allow you to special case your database workload then we have no
argument why we should not do the same thing for realtime workloads
where the SCHED_FAIR housekeeping thread can briefly hold a lock to
access some important data shared with the SCHED_FIFO realtime
computation thread. Of course the RT people want to avoid the lock
contention as much as you do, just for different reasons.

Add SCHED_EDF, cgroups and hierarchical scheduling to the picture and
all hell breaks loose.

Why? Simply because you applied the "everything is a nail, therefore I
use a hammer" approach. Understandable, because in your case (database
workload) almost everything is the same nail.

Thanks,

tglx









