Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

From: Matt Fleming
Date: Tue May 17 2016 - 08:24:23 EST


On Tue, 17 May, at 04:11:09AM, Yuyang Du wrote:
> On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote:
> >
> > No because if someone calls rq_clock() immediately after __schedule(),
> > or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
> > should trigger a warning since the clock has not actually been
> > updated.
>
> Well, I don't know how concurrent it can be, but aren't both update
> and read synchronized by rq->lock? So I don't understand the latter
> case, and the former should be addressed (missing its own update?).

I'm not talking about concurrency; when I said "someone" above, I was
referring to code.

So, if the code looks like the following, either now or in the future,

static void __schedule(bool preempt)
{
...
/* Clear RQCF_ACT_SKIP */
rq->clock_update_flags = 0;
...
delta = rq_clock();
}

I would expect to see a warning triggered, because we've read the rq
clock outside of the code area where we know it's safe to do so
without a clock update.

The solution for that bug may be as simple as rearranging the code,

delta = rq_clock();
...
rq->clock_update_flags = 0;

but we definitely want to catch such bugs in the first instance.