Re: [PATCH 2/7] nohz: New tick dependency mask
From: Frederic Weisbecker
Date: Tue Dec 01 2015 - 17:20:37 EST
On Tue, Dec 01, 2015 at 09:41:09PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 13, 2015 at 03:22:04PM +0100, Frederic Weisbecker wrote:
> > The tick dependency is evaluated on every IRQ. This is a batch of checks
> > which determine whether it is safe to stop the tick or not. These checks
> > are often split in many details: posix cpu timers, scheduler, sched clock,
> > perf events. Each of which is made of smaller details: posix cpu
> > timer involves checking process wide timers then thread wide timers. Perf
> > involves checking freq events then more per cpu details.
> >
> > Checking these details asynchronously every time we update the full
> > dynticks state brings avoidable overhead and a messy layout.
> >
> > Let's introduce instead tick dependency masks: one for system wide
> > dependency (unstable sched clock), one for CPU wide dependency (sched,
> > perf), and task/signal level dependencies. The subsystems are responsible
> > for setting and clearing their dependency through a set of APIs that will
> > take care of concurrent dependency mask modifications and kick targets
> > to restart the relevant CPU tick whenever needed.
>
> Maybe better explain why we need the per task and per signal thingy?
I'll detail that some more in the changelog. The only user of the per task/per signal
tick dependency is posix cpu timers. I first proposed a global tick dependency that
is set as soon as any posix cpu timer is armed. It simplified everything but some
reviewers complained (eg: some users might want to run posix timers on housekeepers
without bothering full dynticks CPUs). I could remove the per signal dependency by
dispatching it through all threads in the group each time there is an update, but
that's the best I can think of.
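
For readers following the thread, the layering described above can be modelled in a few lines of user-space C. This is only a sketch under assumptions, not code from the patch: the struct names, the `global_tick_dependency` variable, `DEP_MASK()` and `can_stop_tick_model()` are hypothetical, and the bit positions are invented because the mail does not show the real enum values.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the real enum values are not shown in this mail. */
enum tick_dependency_bit {
	TICK_POSIX_TIMER_BIT,
	TICK_PERF_EVENTS_BIT,
	TICK_SCHED_BIT,
	TICK_CLOCK_UNSTABLE_BIT,
};

#define DEP_MASK(bit)	(1UL << (bit))

/* Hypothetical containers, one dependency mask per level. */
struct signal_model { unsigned long tick_dependency; };	/* thread group wide */
struct task_model {
	unsigned long tick_dependency;			/* single thread */
	struct signal_model *signal;
};
struct cpu_model { unsigned long tick_dependency; };	/* sched, perf */

static unsigned long global_tick_dependency;		/* unstable sched clock */

/* The tick may stop only when no level holds a dependency. */
static bool can_stop_tick_model(const struct cpu_model *cpu,
				const struct task_model *curr)
{
	if (global_tick_dependency)
		return false;
	if (cpu->tick_dependency)
		return false;
	if (curr->tick_dependency)
		return false;
	if (curr->signal->tick_dependency)
		return false;
	return true;
}
```

In this model a posix cpu timer armed on a thread group only sets the bit in that group's `signal_model`, so a CPU running an unrelated task still sees four empty masks and can stop its tick.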
>
> > +static void trace_tick_dependency(unsigned long dep)
> > +{
> > +	if (dep & TICK_POSIX_TIMER_MASK) {
> > +		trace_tick_stop(0, "posix timers running\n");
> > +		return;
> > +	}
> > +
> > +	if (dep & TICK_PERF_EVENTS_MASK) {
> > +		trace_tick_stop(0, "perf events running\n");
> > +		return;
> > +	}
> > +
> > +	if (dep & TICK_SCHED_MASK) {
> > +		trace_tick_stop(0, "more than 1 task in runqueue\n");
> > +		return;
> > +	}
> > +
> > +	if (dep & TICK_CLOCK_UNSTABLE_MASK)
> > +		trace_tick_stop(0, "unstable sched clock\n");
> > +}
>
> I would suggest ditching the strings and using the
Using a code value instead?
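
One possible reading of "a code value" is to report the first matching dependency mask itself rather than a string. The sketch below is a user-space illustration under assumptions, not the patch: the mask values, `TICK_NONE_CODE` and `tick_stop_reason_code()` are hypothetical, and the priority order simply mirrors the order of the checks in the quoted trace_tick_dependency().

```c
#include <stddef.h>

/* Hypothetical mask values; the patch's definitions are not shown in this mail. */
#define TICK_POSIX_TIMER_MASK		(1UL << 0)
#define TICK_PERF_EVENTS_MASK		(1UL << 1)
#define TICK_SCHED_MASK			(1UL << 2)
#define TICK_CLOCK_UNSTABLE_MASK	(1UL << 3)
#define TICK_NONE_CODE			0UL	/* hypothetical "no dependency" code */

/*
 * Return a numeric code for the first dependency found, checked in the
 * same order as the quoted function, or TICK_NONE_CODE if dep is empty.
 */
static unsigned long tick_stop_reason_code(unsigned long dep)
{
	static const unsigned long order[] = {
		TICK_POSIX_TIMER_MASK,
		TICK_PERF_EVENTS_MASK,
		TICK_SCHED_MASK,
		TICK_CLOCK_UNSTABLE_MASK,
	};
	size_t i;

	for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
		if (dep & order[i])
			return order[i];
	}
	return TICK_NONE_CODE;
}
```

A tracepoint could then record the returned value and leave the translation to a human readable string to whatever formats the trace output.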
>
> > +static void kick_all_work_fn(struct work_struct *work)
> > +{
> > +	tick_nohz_full_kick_all();
> > +}
> > +static DECLARE_WORK(kick_all_work, kick_all_work_fn);
> > +
> > +void __tick_nohz_set_dep_delayed(enum tick_dependency_bit bit, unsigned long *dep)
> > +{
> > +	unsigned long prev;
> > +
> > +	prev = fetch_or(dep, BIT_MASK(bit));
> > +	if (!prev) {
> > +		/*
> > +		 * We need the IPIs to be sent from sane process context.
>
> Why ?
Because the posix cpu timers code always runs with interrupts disabled, and we
can't send IPIs from there.
>
> > +		 * The posix cpu timers are always set with irqs disabled.
> > +		 */
> > +		schedule_work(&kick_all_work);
> > +	}
> > +}
> > +
> > +/*
> > + * Set a global tick dependency. Lets do the wide IPI kick asynchronously
> > + * for callers with irqs disabled.
>
> This seems to suggest you can call this with IRQs disabled
Ah right, that's a misleading comment. We need to use the _delayed() version
when interrupts are disabled.
Thanks.
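
The split between the two setters can be modelled in user space with C11 atomics. This is a sketch under assumptions rather than the kernel code: `atomic_fetch_or()` stands in for the patch's fetch_or(), the two counters stand in for tick_nohz_full_kick_all() and schedule_work(), and every name ending in `_model` is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_ulong tick_dependency_model;	/* stand-in for the global mask */
static int immediate_kicks;			/* stand-in for tick_nohz_full_kick_all() */
static int deferred_kicks;			/* stand-in for schedule_work() */

/*
 * Atomically set one bit and report whether the mask was empty before.
 * As in the quoted patch, only the empty -> non-empty transition asks
 * for a kick.
 */
static bool set_dep_bit_model(atomic_ulong *dep, unsigned int bit)
{
	unsigned long prev = atomic_fetch_or(dep, 1UL << bit);

	return prev == 0;
}

/* Variant for callers that may send IPIs directly (irqs enabled). */
static void set_dep_model(unsigned int bit)
{
	if (set_dep_bit_model(&tick_dependency_model, bit))
		immediate_kicks++;
}

/* Variant for callers with irqs disabled: defer the kick. */
static void set_dep_delayed_model(atomic_ulong *dep, unsigned int bit)
{
	if (set_dep_bit_model(dep, bit))
		deferred_kicks++;
}
```

Setting a second bit on a mask that is already non-empty triggers no further kick in this model, which matches the `if (!prev)` test in both quoted functions.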
>
> > + */
> > +void tick_nohz_set_dep(enum tick_dependency_bit bit)
> > +{
> > +	unsigned long prev;
> > +
> > +	prev = fetch_or(&tick_dependency, BIT_MASK(bit));
> > +	if (!prev)
> > +		tick_nohz_full_kick_all();
>
> But that function seems implemented using smp_call_function_many() which
> cannot be called with IRQs disabled.