Re: [PATCH 11/32] nohz/cpuset: Don't turn off the tick if rcu needs it
From: Frederic Weisbecker
Date: Wed Mar 28 2012 - 09:38:45 EST
On Wed, Mar 28, 2012 at 02:57:44PM +0200, Gilad Ben-Yossef wrote:
> On Wed, Mar 28, 2012 at 2:39 PM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> > On Tue, Mar 27, 2012 at 05:21:34PM +0200, Gilad Ben-Yossef wrote:
> >> On Thu, Mar 22, 2012 at 6:18 PM, Christoph Lameter <cl@xxxxxxxxx> wrote:
> >> > On Thu, 22 Mar 2012, Gilad Ben-Yossef wrote:
> >> >
> >> >> > Is there any way for userspace to know that the tick is not off yet due to
> >> >> > this? It would make sense for us to have a busy loop in user space that
> >> >> > waits until the OS has completed all processing, if that avoids future
> >> >> > latencies for the application.
> >> >> >
> >> >>
> >> I previously suggested having the user register to receive a signal
> >> when the tick is turned off. Since, by design, the tick is only turned
> >> off while the user task is the current task, *I think* you can simply
> >> mark the signal pending when you turn the tick off.
> >> >
> >> > Ok that sounds good. You would define a new signal for this?
> >> >
> >> My gut instinct is to let the process register with a specific signal
> >> (probably in the RT range) it wants to receive when the tick goes off
> >> and/or on.
> > Note the signal itself could trigger an event that could restart the tick.
> > Calling call_rcu() is sufficient for that. We can probably optimize that
> > one day by assigning another CPU to handle the callbacks of a tickless
> > CPU but for now...
> >> > So we would startup the application. App will do all prep work (memory
> >> > allocation, device setup etc etc) and then wait for the signal to be
> >> > received. After that it would enter the low latency processing phase.
> >> >
> >> > Could we also get a signal if something disrupts the peace and switches
> >> > the timer interrupt on again?
> >> >
> >> I think you'll have to, since once the tick is turned off there is no
> >> guarantee that it won't get turned on again by a timer, by the
> >> scheduling of a task, or by an IPI.
> > The problem with this scheme is that if the task is running under the
> > guarantee that nothing is going to disturb it (it assumes so once it
> > is notified that the timer is stopped), can it seriously recover from
> > the fact that the timer has been restarted once it gets notified about it?
> Recovery in this context involves a programmer/system architect looking
> into what made the tick start and making sure that won't happen the next
> time around.
> I know it's not quite what you had in mind, but it works :-)
So this is about fixing bugs. Tracing may fit better for that.
> > I have a hard time imagining that. It's like an RT task running a
> > critical part that suddenly receives a notification from the kernel that
> > says "what's up dude? hey by the way you're not real time anymore" :)
> > How are we recovering from that?
> The point is that it is the difference between a QA report that says:
> "Performance dropped below an acceptable level for 10 ms at some point
> during the test run"
> and one that says:
> "We got an indication that the kernel resumed the tick on us, so the test
> was stopped; here is the stack trace for all the tasks running,
> plus the logs".
That's about post-run analysis; that sounds like a job for tracing.
> > Maybe instead of focusing on these notifications, we should try hard to
> > shut down the tick before we reach userspace: delegate RCU work
> > to another CPU, avoid needless IPIs, avoid needless timer list timers, etc...
> > Fix those things one by one such that we can configure things to the point we
> > get closer to a guarantee of CPU isolation.
> > Does that sound reasonable?
> It does to me :-)
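FWIW, the check this particular patch is circling around boils down to something like the following. This is simplified pseudocode, not the actual patch; keep_tick() is a made-up placeholder for "leave the tick running".

```c
/* Simplified pseudocode of the idle/nohz decision, not the actual
 * patch: only stop the tick when nothing on this CPU still needs it. */
if (rcu_needs_cpu(cpu)) {
	/* RCU still has callbacks queued on this CPU: keep the tick
	 * so grace-period processing keeps making progress. */
	keep_tick();		/* placeholder, not a real function */
} else {
	tick_nohz_stop_sched_tick(inidle);
}
```

Offloading those callbacks to another CPU, as suggested above, would let the rcu_needs_cpu() branch go away for isolated CPUs.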
> Gilad Ben-Yossef
> Chief Coffee Drinker
> Israel Cell: +972-52-8260388
> US Cell: +1-973-8260388
> "If you take a class in large-scale robotics, can you end up in a
> situation where the homework eats your dog?"
> -- Jean-Baptiste Queru
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/