Re: [PATCH 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism

From: Tejun Heo
Date: Thu May 11 2023 - 18:53:00 EST


On Thu, May 11, 2023 at 11:26:06PM +0200, Peter Zijlstra wrote:
> On Thu, May 11, 2023 at 08:19:30AM -1000, Tejun Heo wrote:
> > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > prevents stalling other concurrency-managed work items. If a work function
> > keeps running over the threshold, it likely needs to be switched to use an
> > unbound workqueue.
> >
> > This patch adds a debug mechanism which tracks the work functions which
> > trigger the automatic CPU_INTENSIVE mechanism and reports them using
> > pr_warn() with exponential backoff.
> >
> > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> > Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> I did wonder why you chose external storage for this -- I figured
> it was to keep the cost down since it shouldn't really be happening, so
> storage in the normal data structures is a waste etc..?

The only relevant data structures are workqueue and work_struct. The former
is too coarse because a given workqueue can run any number of different work
items (e.g. system_wq). The latter is too transient to record anything on.
In a lot of cases, the only meaningfully identifiable thing is the work
function pointer, which doesn't have any data structure attached by default,