Re: [patch] sched: Fix clock_gettime(CLOCK_[PROCESS/THREAD]_CPUTIME_ID) monotonicity

From: Mike Galbraith
Date: Mon Jun 23 2014 - 13:07:32 EST


On Mon, 2014-06-23 at 17:14 +0200, Peter Zijlstra wrote:
> On Mon, Jun 16, 2014 at 11:16:52AM +0200, Mike Galbraith wrote:
> > (disregard patch of same name from that enterprise weenie;)
> >
> > If a task has been dequeued, it has been accounted. Do not project
> > cycles that may or may not ever be accounted to a dequeued task, as
> > that may make clock_gettime() both inaccurate and non-monotonic.
> >
> > Protect update_rq_clock() from slight TSC skew while at it.
> >
> > Signed-off-by: Mike Galbraith <umgwanakikbuti@xxxxxxxxx>
> > ---
> > kernel/sched/core.c | 13 +++++++++++--
> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -144,6 +144,8 @@ void update_rq_clock(struct rq *rq)
> > return;
> >
> > delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
> > + if (delta < 0)
> > + return;
> > rq->clock += delta;
> > update_rq_clock_task(rq, delta);
> > }
>
> Have you actually observed this? If TSC is stable this should not
> happen, if TSC is not stable we should be using kernel/sched/clock.c
> which should also avoid this, because while sched_clock_cpu(x) -
> sched_clock_cpu(y) < 0 is possible, sched_clock_cpu(x) -
> sched_clock_cpu(x) should always be >= 0.
>
> I suppose it can happen when the TSC gets screwed and we haven't
> switched to the slow path yet.

Seen, no. I saw it while eyeballing, and thought making it impossible
for a dinky offset to create a negative delta would be a better plan
than hoping such don't exist in the wild.

> > @@ -2533,7 +2535,12 @@ static u64 do_task_delta_exec(struct tas
> > {
> > u64 ns = 0;
> >
> > - if (task_current(rq, p)) {
> > + /*
> > + * Must be ->curr, ->on_cpu _and_ ->on_rq. If dequeued, we
> > + * would project cycles that may never be accounted to this
> > + * thread, breaking clock_gettime().
> > + */
> > + if (task_current(rq, p) && p->on_cpu && p->on_rq) {
>
> do we still need ->on_cpu in this case?

Damn, I had a reason for leaving that, but seems no now.

I don't see why we need to play the accounting projection game at all
really. If what's accounted at call time isn't good enough, we should
be doing real accounting, not prognostication.

Anyway, I'll retest with the on_cpu dropped, which should either make me
remember why I felt it should stay, or give me cause to make it gone :)

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/