Re: [PATCH] thermal: core: fix use-after-free due to init/cancel delayed_work race

From: Rafael J. Wysocki

Date: Wed Mar 25 2026 - 09:01:32 EST


On Wed, Mar 25, 2026 at 12:51 AM Mauricio Faria de Oliveira
<mfo@xxxxxxxxxx> wrote:
>
> If INIT_DELAYED_WORK() is called for a currently running work item,
> cancel_delayed_work_sync() is unable to cancel/wait for it anymore,
> as the work item's data bits required for that are cleared.
>
> In the resume path, INIT_DELAYED_WORK() is called twice:
> 1) to replace the work function: thermal_zone_device_check/resume()
> 2) to restore it.
>
> Both cases might race with the unregister path and bypass the call to
> cancel_delayed_work_sync(),

So this is the problem, isn't it?

> after which struct thermal_zone_device *tz
> is freed, and the non-canceled/non-waited for work hits use-after-free.

Which basically means that a TZ_STATE_FLAG_EXIT check is missing in
both thermal_zone_pm_complete() and thermal_zone_device_resume().

> Fix the first case with a dedicated work item for the resume function,
> and the second case by initializing the work item(s) only during init.

So why not add those missing checks instead of complicating the code even more?
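
For illustration only, here is a minimal userspace sketch of the kind of check meant above. It is not the thermal core code. struct tz_model, its fields and the tz_model_*() helpers are hypothetical, the flag value is arbitrary, and a pthread mutex stands in for the zone lock. It assumes the unregister path sets TZ_STATE_FLAG_EXIT under the same lock before it cancels the work, so a resume path that checks the flag under that lock never re-initializes a work item for a zone that is going away.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model: the name mirrors the thread, the value is arbitrary. */
#define TZ_STATE_FLAG_EXIT 0x1u

struct tz_model {
	pthread_mutex_t lock;   /* stands in for the zone lock */
	unsigned int state;     /* state flags, including the exit flag */
	int work_inits;         /* counts simulated work re-initializations */
};

static void tz_model_init(struct tz_model *tz)
{
	pthread_mutex_init(&tz->lock, NULL);
	tz->state = 0;
	tz->work_inits = 0;
}

/*
 * Unregister path (assumed ordering): mark the zone as exiting under the
 * lock. The real code would cancel and wait for the work after this point.
 */
static void tz_model_unregister(struct tz_model *tz)
{
	pthread_mutex_lock(&tz->lock);
	tz->state |= TZ_STATE_FLAG_EXIT;
	pthread_mutex_unlock(&tz->lock);
}

/*
 * Resume path: skip the work re-initialization if the zone is exiting.
 * Returns true if the (simulated) re-initialization was done.
 */
static bool tz_model_pm_complete(struct tz_model *tz)
{
	bool reinit = false;

	pthread_mutex_lock(&tz->lock);
	if (!(tz->state & TZ_STATE_FLAG_EXIT)) {
		tz->work_inits++;   /* where the work item would be re-initialized */
		reinit = true;
	}
	pthread_mutex_unlock(&tz->lock);

	return reinit;
}
```

In this model, a resume that runs before unregistration re-initializes the work as usual, while one that runs after the flag is set does nothing, so the cancel in the unregister path is not bypassed.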