Re: [PATCH] driver core: Use unbound workqueue for deferred probes

From: Yogesh Lal
Date: Mon Mar 15 2021 - 06:46:04 EST




On 2/25/2021 5:14 PM, Greg KH wrote:
On Thu, Feb 25, 2021 at 04:03:50PM +0530, Yogesh Lal wrote:
Hi Greg,


On 2/24/2021 6:13 PM, Greg KH wrote:
On Wed, Feb 24, 2021 at 05:25:49PM +0530, Yogesh Lal wrote:
Queue deferred driver probes on an unbound workqueue, to allow the
scheduler to better manage scheduling of long-running probes.

Really? What does this change and help? What is the visible effect of
this patch? What problem does it solve?


We observed a boot-up improvement (~400 msec) when the deferred probe work is
made unbound. This is due to the scheduler moving the worker running the
deferred probe work to big CPUs. Without this change, we see the worker
running on a LITTLE CPU due to affinity.

Why is none of this information in the changelog text? How are we
supposed to know this? And is this 400msec out of 10 seconds or

We first wanted to understand why a bound deferred probe was really required.

something else? Also, this sounds like your "little" cpus are really
bad, you might want to look into fixing them first :)


~600 ms with deferred probe bound to a LITTLE core versus ~200 ms with deferred probe queued on an unbound wq.

But if you really want to make this go faster, do not defer your probe!
Why not fix that problem in your drivers instead?


Yes, we are exploring that direction as well, but we want to get upstream's opinion and understand the usability of an unbound wq.

Please let us know if there are any concerns/restrictions requiring that
deferred probe work run only on pinned kworkers. Since this work runs the
deferred probes of several devices, the locality may not be that important.

Can you prove that it is not important? I know lots of gyrations are
done in some busses to keep probe happening on the same CPU for very
good reasons. Changing that should not be done lightly as you will
break this.

While debugging further and checking whether probes migrate, I found that the init thread can potentially migrate during driver probe, as its CPU affinity is set to all CPUs (or is there something that prevents this which I am missing?). Also, async probes use an unbound workqueue.
So using an unbound wq for deferred probes looks similar to these with respect to scheduling behavior.
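
As a rough userspace illustration of the affinity point above (this is not kernel code and not part of the patch), the sketch below counts how many CPUs the calling thread is allowed to run on and then pins it to a single CPU. A count greater than one means the scheduler is free to migrate the thread, which is the situation described for the init thread; a count of one is loosely analogous to a pinned kworker. The function names and the 8192-CPU mask size are assumptions made for this example only, and raw syscalls are used just to keep the mask size explicit.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed upper bound on CPUs for this sketch: 8192 bits of mask. */
#define MASK_WORDS (8192 / (8 * sizeof(unsigned long)))

/* Read the calling thread's affinity mask; returns bytes filled, or -1. */
long read_affinity(unsigned long *mask)
{
	memset(mask, 0, MASK_WORDS * sizeof(unsigned long));
	return syscall(SYS_sched_getaffinity, 0,
		       MASK_WORDS * sizeof(unsigned long), mask);
}

/* Number of CPUs the calling thread may run on, or -1 on error. */
int allowed_cpu_count(void)
{
	unsigned long mask[MASK_WORDS];
	int count = 0;
	size_t i;

	if (read_affinity(mask) < 0)
		return -1;
	for (i = 0; i < MASK_WORDS; i++)
		count += __builtin_popcountl(mask[i]);
	return count;
}

/* Pin the calling thread to the lowest-numbered CPU it is allowed on. */
int pin_to_first_allowed_cpu(void)
{
	unsigned long mask[MASK_WORDS];
	size_t i;

	if (read_affinity(mask) < 0)
		return -1;
	for (i = 0; i < MASK_WORDS; i++) {
		if (mask[i]) {
			/* Isolate the lowest set bit in this word. */
			unsigned long lowest = mask[i] & -mask[i];

			memset(mask, 0, sizeof(mask));
			mask[i] = lowest;
			return (int)syscall(SYS_sched_setaffinity, 0,
					    sizeof(mask), mask);
		}
	}
	return -1;
}
```

Calling allowed_cpu_count() before and after pin_to_first_allowed_cpu() shows the difference: the first call reports every CPU the thread may use, and after pinning it should report exactly one.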



thanks,

greg k-h


--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation