Re: [PATCH v2] sched: Warn on long periods of pending need_resched

From: Josh Don
Date: Thu Mar 25 2021 - 17:51:48 EST


On Wed, Mar 24, 2021 at 4:27 AM Mel Gorman <mgorman@xxxxxxx> wrote:
>
> I'm not a fan of the name. I know other sysctls have _enabled in the
> name but it's redundant. If you say the name out loud, it sounds weird.
> I would suggest an alternative but see below.

Now using the version rebased by Peter; this control has gone away and
we have simply a scheduling feature "LATENCY WARN"

> I suggest semantics and naming similar to hung_task_warnings
> because it's sortof similar. resched_latency_warnings would combine
> resched_latency_warn_enabled and resched_latency_warn_once. 0 would mean
> "never warn", -1 would mean always warn and any positive value means
> "warn X number of times".

See above. I'm happy with the enabled bit being toggled separately by
a sched feature; the warn_once behavior is not overloaded with the
enabling/disabling. Also, I don't see value in "warn X number of
times", given the warning is rate limited anyway.

> > +int sysctl_resched_latency_warn_ms = 100;
> > +int sysctl_resched_latency_warn_once = 1;
>
> Use __read_mostly

Good point, done.

> > +#ifdef CONFIG_SCHED_DEBUG
> > +static u64 resched_latency_check(struct rq *rq)
> > +{
> > + int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
> > + u64 need_resched_latency, now = rq_clock(rq);
> > + static bool warned_once;
> > +
> > + if (sysctl_resched_latency_warn_once && warned_once)
> > + return 0;
> > +
>
> That is a global variable that can be modified in parallel and I do not
> think it's properly locked (scheduler_tick is holding rq lock which does
> not protect this).
>
> Consider making resched_latency_warnings atomic and use
> atomic_dec_if_positive. If it drops to zero in this path, disable the
> static branch.
>
> That said, it may be overkill. hung_task_warnings does not appear to have
> special protection that prevents it going to -1 or lower values by accident
> either. Maybe it can afford to be a bit more relaxed because a system that
> is spamming hung task warnings is probably dead or might as well be dead.

There's no real issue if we race over modification to that sysctl.
This is intentionally not more strongly synchronized for that reason.

> > + if (!need_resched() || WARN_ON_ONCE(latency_warn_ms < 2))
> > + return 0;
> > +
>
> Why is 1ms special? Regardless of the answer, if the sysctl should not
> be 1 then the user should not be able to set it to 1.

Yea let me change that to !latency_warn_ms so it isn't HZ specific.

>
> > + /* Disable this warning for the first few mins after boot */
> > + if (now < resched_boot_quiet_sec * NSEC_PER_SEC)
> > + return 0;
> > +
>
> Check system_state == SYSTEM_BOOTING instead?

Yea, that might be better; let me test that.

> > + if (!rq->last_seen_need_resched_ns) {
> > + rq->last_seen_need_resched_ns = now;
> > + rq->ticks_without_resched = 0;
> > + return 0;
> > + }
> > +
> > + rq->ticks_without_resched++;
> > + need_resched_latency = now - rq->last_seen_need_resched_ns;
> > + if (need_resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
> > + return 0;
> > +
>
> The naming need_resched_latency implies it's a boolean but it's not.
> Maybe just resched_latency?
>
> Similarly, resched_latency_check implies it returns a boolean but it
> returns an excessive latency value. At this point I've been reading the
> patch for a long time so I've ran out of naming suggestions :)

The "need_" part does confuse it a bit; I reworded these to hopefully
make it more clear.

> > + warned_once = true;
> > +
> > + return need_resched_latency;
> > +}
> > +
>
> I note that you split when a warning is needed and printing the warning
> but it's not clear why. Sure you are under the RQ lock but there are other
> places that warn under the RQ lock. I suppose for consistency it could
> use SCHED_WARN_ON even though all this code is under SCHED_DEBUG already.

We had seen a circular lock dependency warning (console_sem, pi lock,
rq lock), since printing might need to wake a waiter. However, I do
see plenty of warns under rq->lock, so maybe I missed a patch to
address this?