Re: [RFC PATCH] workqueue: introduce queue_delayed_work_on_offline_safe.

From: imran . f . khan
Date: Mon Jan 13 2025 - 22:01:46 EST


Hello Tejun,
Thanks for taking a look at this.

On 14/1/2025 5:21 am, Tejun Heo wrote:
> On Mon, Jan 13, 2025 at 03:35:40PM +1100, Imran Khan wrote:
> ...
>> I have kept the patch as RFC because from mailing list,
>> I could not find any users, of queue_delayed_work_on,
>> that is ending up queuing dwork on an offlined CPU.
>> We have some in-house code that is running into this problem,
>> and currently we are fixing it on caller side of queue_delayed_work_on.
>> Other users who run into this issue, can also use the approach of
>> fixing it on caller side or we can use the interface introduced
>> here for such use cases.
>
> I'm not sure how necessary this is. If the timer is okay to run on other
> CPUs, might as well just use queue_delayed_work().
>

Yes, right now I can't locate anything in the upstream kernel that gets
broken due to the issue mentioned here.
All users of queue_delayed_work_on (except 3, mentioned further down)
use smp_processor_id() to specify the CPU, so the specified CPU can't
be an already offlined CPU.

I see the following 3 files (in v6.12.6) using queue_delayed_work_on
with some sort of cached CPU information:

drivers/net/ethernet/pensando/ionic/ionic_dev.c -> line 177
drivers/scsi/esas2r/esas2r_main.c -> line 1858
drivers/scsi/lpfc/lpfc_sli.c -> lines 14987 and 15381

But it looks like in these cases the specified CPU remains online, or
they simply have not encountered the issue mentioned here.


For this patch, yes, the timer is okay to run on other CPUs, but only
as a last resort; most of the time it can still run on the specified
CPU (assuming it's online).
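
To make the intended behaviour concrete, here is a rough userspace model
of the CPU selection only. It is a sketch, not the patch: the online
table, NR_CPUS_MODEL, MODEL_ANY_CPU and pick_dwork_cpu() are all
hypothetical names made up for illustration. In the kernel the check
would be cpu_online(), the fallback would correspond to what
queue_delayed_work() does (WORK_CPU_UNBOUND), and the check would have to
be made stable against concurrent hotplug, e.g. under cpus_read_lock().

```c
#include <stdbool.h>

/* Hypothetical size, for this model only. */
#define NR_CPUS_MODEL 4

/* Stand-in for WORK_CPU_UNBOUND: "let the workqueue pick a CPU". */
#define MODEL_ANY_CPU (-1)

/*
 * Mocked online state; CPU 2 is modelled as offlined.
 * In the kernel this information comes from cpu_online(cpu).
 */
static bool model_cpu_online[NR_CPUS_MODEL] = { true, true, false, true };

/*
 * Return the CPU the delayed work should be queued on:
 * the requested CPU while it is online, otherwise fall back to
 * "any CPU" as a last resort.
 */
static int pick_dwork_cpu(int cpu)
{
	if (cpu >= 0 && cpu < NR_CPUS_MODEL && model_cpu_online[cpu])
		return cpu;

	return MODEL_ANY_CPU;
}
```

So a request for CPU 0 stays on CPU 0, while a request for the offlined
CPU 2 falls back to any CPU instead of arming a timer that may never
fire.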

Thanks,
Imran



> Thanks.
>