Re: [PATCH v3 (repost)] workqueue: Warn flushing of kernel-global workqueues

From: Tetsuo Handa
Date: Thu May 12 2022 - 07:32:52 EST


On 2022/05/12 19:38, Dmitry Torokhov wrote:
> Hi Tejun,
>
> On Mon, Mar 21, 2022 at 07:02:45AM -1000, Tejun Heo wrote:
>> I'm willing to bet that the majority of the use cases can be converted to
>> use flush_work() and that'd be the preference. We need a separate workqueue
>> iff the flush requirement is complex (e.g. there are multiple dynamic work
>> items in flight which need to be flushed together) or the work items need
>> some special attributes (such as MEM_RECLAIM or HIGHPRI) which don't apply
>> to the system_wq users in the first place.
>
> This means that now the code has to keep track of all work items that it
> allocated, instead of being able to "fire and forget" work items (when
> dealing with extremely infrequent events) and rely on flush_workqueue() to
> clean up.

Yes. Moreover, a patch to catch and reject such flushes at compile time was proposed at
https://lkml.kernel.org/r/738afe71-2983-05d5-f0fc-d94efbdf7634@xxxxxxxxxxxxxxxxxxx .

> That flush typically happens in module unload path, and I
> wonder if the restriction on flush_workqueue() could be relaxed to allow
> calling it on unload.

A patch for drivers/input/mouse/psmouse-smbus.c is waiting for your response at
https://lkml.kernel.org/r/25e2b787-cb2c-fb0d-d62c-6577ad1cd9df@xxxxxxxxxxxxxxxxxxx .
As with many modules, flush_workqueue() is called only on module unload in your case.

We currently don't have a flag to tell whether the caller is inside the module
unload path. And even inside the module unload path, flushing a system-wide
workqueue is problematic under e.g. GFP_NOFS/GFP_NOIO context. Therefore, I don't
think that being inside the module unload path makes a good exception.

The purpose of removing flush_scheduled_work() is to proactively avoid new problems like
https://lkml.kernel.org/r/385ce718-f965-4005-56b6-34922c4533b8@xxxxxxxxxxxxxxxxxxx
and https://lkml.kernel.org/r/20220225112405.355599-10-Jerome.Pouiller@xxxxxxxxxx .

Using a local WQ also helps for documentation purposes: it makes clear where the
work's dependency is. Please grep the linux-next.git tree; some users have
already been converted.
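
For illustration, a conversion to a local WQ could look roughly like the sketch
below. This is a minimal sketch, not taken from any in-tree driver: all foo_*
names are hypothetical, and the alloc_workqueue() flags (0, 0) are only an
assumption that the work items need no special attributes. It keeps the
"fire and forget" pattern, because destroy_workqueue() drains the local WQ on
unload instead of flush_scheduled_work() flushing the system-wide one.

```c
/* Sketch only: every foo_* name is hypothetical. */
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

static struct workqueue_struct *foo_wq;	/* module-local WQ */

struct foo_event {
	struct work_struct work;
};

static void foo_event_fn(struct work_struct *work)
{
	struct foo_event *ev = container_of(work, struct foo_event, work);

	/* ... handle the infrequent event here ... */
	kfree(ev);	/* the work item frees itself when done */
}

/* "Fire and forget": the caller does not keep track of the work item. */
static int foo_report_event(void)
{
	struct foo_event *ev = kzalloc(sizeof(*ev), GFP_KERNEL);

	if (!ev)
		return -ENOMEM;
	INIT_WORK(&ev->work, foo_event_fn);
	queue_work(foo_wq, &ev->work);	/* was: schedule_work(&ev->work) */
	return 0;
}

static int __init foo_init(void)
{
	/* Assumption: no MEM_RECLAIM/HIGHPRI needed, default max_active. */
	foo_wq = alloc_workqueue("foo_wq", 0, 0);
	return foo_wq ? 0 : -ENOMEM;
}

static void __exit foo_exit(void)
{
	/* was: flush_scheduled_work() */
	destroy_workqueue(foo_wq);	/* drains pending work, then frees the WQ */
}

module_init(foo_init);
module_exit(foo_exit);
MODULE_LICENSE("GPL");
```

With this shape, the only state the module has to remember is foo_wq itself.
The sketch assumes nothing can call foo_report_event() once foo_exit() has
started.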

Any chance you have too many out-of-tree modules to convert?