Re: [PATCH misc 1/2] workqueue: Add check for clocks going backwards to wq_worker_tick()
From: Paul E. McKenney
Date: Fri Aug 02 2024 - 12:39:43 EST
On Fri, Aug 02, 2024 at 12:06:06PM -0400, Rik van Riel wrote:
> On Thu, 2024-08-01 at 17:30 -0700, Paul E. McKenney wrote:
> >
> > +++ b/kernel/workqueue.c
> > @@ -1482,6 +1482,7 @@ void wq_worker_tick(struct task_struct *task)
> > * If the current worker is concurrency managed and hogged the CPU for
> > * longer than wq_cpu_intensive_thresh_us, it's automatically marked
> > * CPU_INTENSIVE to avoid stalling other concurrency-managed work items.
> > + * If the time is negative, ignore, assuming a backwards clock.
> > *
> > * Set @worker->sleeping means that @worker is in the process of
> > * switching out voluntarily and won't be contributing to
> > @@ -1491,6 +1492,7 @@ void wq_worker_tick(struct task_struct *task)
> > * We probably want to make this prettier in the future.
> > */
> > if ((worker->flags & WORKER_NOT_RUNNING) || READ_ONCE(worker->sleeping) ||
> > + WARN_ON_ONCE((s64)(worker->task->se.sum_exec_runtime - worker->current_at) < 0) ||
> > worker->task->se.sum_exec_runtime - worker->current_at <
> > wq_cpu_intensive_thresh_us * NSEC_PER_USEC)
> > return;
>
> What is the code path by which sum_exec_runtime could go backward
> in time, if the TSC and sched_clock() jump backward?
>
> Might it make sense to check in the place where sum_exec_runtime is
> updated, instead, and cast a wider net?
>
> On the flip side, the run time increments are "fairly large" in
> number of TSC cycles, while most of the negative TSC jumps we
> have seen are quite small, so even that wider net might not catch
> much because of how coarse these updates typically are...
Good points! Even more telling, this patch didn't catch anything during
Breno's tests. I will drop it, but keep it on a workqueue.2024.08.01
branch in -rcu in case someone needs it later.
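In case the signed-difference trick in the patch above is unfamiliar:
subtracting two unsigned nanosecond counters and casting the result to a
signed type makes a backwards jump show up as a negative value. A minimal
user-space sketch, assuming nothing beyond standard C (the kernel code uses
u64/s64 and WARN_ON_ONCE() where this sketch uses stdint types and printf()):

#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: sum_exec_runtime and current_at are unsigned
 * nanosecond counters.  Casting their difference to a signed type turns
 * a backwards jump (sum_exec_runtime < current_at) into a negative
 * value instead of a huge bogus unsigned delta.
 */
static int ran_backwards(uint64_t sum_exec_runtime, uint64_t current_at)
{
        return (int64_t)(sum_exec_runtime - current_at) < 0;
}

int main(void)
{
        uint64_t current_at = 1000000;  /* runtime when the work item started */

        /* Normal forward progress: difference is positive. */
        printf("forward:  %d\n", ran_backwards(1500000, current_at));

        /* Clock jumped backwards: unsigned subtraction wraps, cast goes negative. */
        printf("backward: %d\n", ran_backwards(900000, current_at));

        return 0;
}

That negative-delta test is what lets the patch skip the CPU-intensive
accounting rather than comparing a wrapped unsigned delta against the
threshold.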
Thanx, Paul
> All Rights Reversed.
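As a rough illustration of Rik's alternative, here is what a backwards-clock
check at the point where the runtime accumulator is advanced might look like.
This is purely hypothetical: none of these names exist in the kernel, where
the real accumulation happens in the scheduler's update_curr().

#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the scheduler entity's runtime bookkeeping. */
struct fake_se {
        uint64_t sum_exec_runtime;      /* total runtime in nanoseconds */
        uint64_t exec_start;            /* clock value at the last update */
};

static void fake_update_runtime(struct fake_se *se, uint64_t now)
{
        int64_t delta = (int64_t)(now - se->exec_start);

        /* A negative delta means the clock went backwards: warn and skip. */
        if (delta < 0) {
                fprintf(stderr, "clock went backwards by %lld ns\n",
                        (long long)-delta);
                se->exec_start = now;
                return;
        }

        se->sum_exec_runtime += (uint64_t)delta;
        se->exec_start = now;
}

int main(void)
{
        struct fake_se se = { .sum_exec_runtime = 0, .exec_start = 1000 };

        fake_update_runtime(&se, 5000); /* normal: accumulates 4000 ns */
        fake_update_runtime(&se, 4000); /* backwards jump: warns, adds nothing */

        printf("sum_exec_runtime = %llu ns\n",
               (unsigned long long)se.sum_exec_runtime);
        return 0;
}

As Rik notes above, though, the real updates accumulate comparatively large
deltas, so a small backwards TSC jump would usually be absorbed before a
check like this could see it.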