Re: [PATCH 4/6] nohz: support PR_DATAPLANE_QUIESCE
From: Peter Zijlstra
Date: Tue May 12 2015 - 06:38:29 EST

On Tue, May 12, 2015 at 11:50:30AM +0200, Ingo Molnar wrote:
>
> * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > On Fri, May 08, 2015 at 01:58:45PM -0400, Chris Metcalf wrote:
> > > This prctl() flag for PR_SET_DATAPLANE sets a mode that requires the
> > > kernel to quiesce any pending timer interrupts prior to returning
> > > to userspace. When running with this mode set, sys calls (and page
> > > faults, etc.) can be inordinately slow. However, user applications
> > > that want to guarantee that no unexpected interrupts will occur
> > > (even if they call into the kernel) can set this flag to guarantee
> > > those semantics.
> >
> > Currently people hot-unplug and hot-plug the CPU to do this.
> > Obviously that's a wee bit horrible :-)
> >
> > Not sure if a prctl like this is any better though. This is a CPU
> > property, not a process one.
>
> So then a prctl() (or other system call) could be a shortcut to:
>
> - move the task to an isolated CPU
> - make sure there _is_ such an isolated domain available
>
> I.e. have some programmatic, kernel provided way for an application to
> be sure it's running in the right environment. Relying on random
> administration flags here and there won't cut it.

No, we already have sched_setaffinity() and we should not duplicate its
ability to move tasks about.
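
For reference, the affinity step is already a few lines of userspace C.
A minimal sketch, assuming glibc on Linux; pin_to_cpu() is a made-up
helper name for illustration, not something from the patch set:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/*
 * Restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure with errno set.
 * pin_to_cpu() is a hypothetical helper, not an existing API.
 */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

This only moves the task. It does nothing about whatever state is
already sitting on the target CPU.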

What this is about is 'clearing' CPU state; it's nothing to do with
tasks.

Ideally we'd never have to clear the state because it should be
impossible to get into this predicament in the first place.

The typical example here is a periodic timer that found its way onto the
CPU and stays there. We're actually working on allowing such self-arming
timers to migrate, so once we have that sorted this could be fixed
properly, I think.

Not sure if there's more pollution that people worry about.

The hotplug hack worked because unplug force-migrates the timers away.
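
For those who have not seen the hack: it amounts to writing 0 and then 1
to the CPU's sysfs online file. A rough sketch, assuming the usual
/sys/devices/system/cpu/cpuN/online interface and root privileges;
cpu_online_path(), write_flag() and bounce_cpu() are illustrative names
only:

```c
#include <stdio.h>

/*
 * Build the sysfs path of a CPU's hotplug control file.
 * Returns 0 on success, -1 on a negative CPU number or a too-small buffer.
 */
int cpu_online_path(char *buf, size_t len, int cpu)
{
    int n;

    if (cpu < 0)
        return -1;

    n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%d/online", cpu);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a single "0" or "1" to a control file. Returns 0 or -1. */
int write_flag(const char *path, int on)
{
    FILE *f = fopen(path, "w");
    int rc;

    if (!f)
        return -1;

    rc = (fputs(on ? "1" : "0", f) == EOF) ? -1 : 0;

    /* The write may only reach the kernel on close, so check that too. */
    if (fclose(f) == EOF)
        rc = -1;

    return rc;
}

/*
 * Take a CPU offline and bring it straight back online.
 * Assumption: the caller is root and the CPU is hot-pluggable.
 */
int bounce_cpu(int cpu)
{
    char path[64];

    if (cpu_online_path(path, sizeof(path), cpu) != 0)
        return -1;
    if (write_flag(path, 0) != 0)
        return -1;
    return write_flag(path, 1);
}
```

Calling bounce_cpu(3) as root would cycle CPU 3. It is a heavy hammer for
what is really a timer placement problem.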