Re: [PATCH] fs/resctrl: Fix use-after-free in resctrl_offline_mon_domain()

From: Reinette Chatre

Date: Thu May 07 2026 - 13:06:24 EST

Hi Tony,

On 5/7/26 8:48 AM, Luck, Tony wrote:
> On Wed, May 06, 2026 at 11:24:30AM -0700, Reinette Chatre wrote:
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index 02f87c4bc03c..cc8620ace7ed 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -4539,8 +4539,19 @@ void resctrl_offline_cpu(unsigned int cpu)
>> d = get_mon_domain_from_cpu(cpu, l3);
>> if (d) {
>> if (resctrl_is_mbm_enabled() && cpu == d->mbm_work_cpu) {
>> - cancel_delayed_work(&d->mbm_over);
>> - mbm_setup_overflow_handler(d, 0, cpu);
>> + if (cancel_delayed_work(&d->mbm_over)) {
>> + mbm_setup_overflow_handler(d, 0, cpu);
>
> Per your comment[1] should this "0" also be MBM_OVERFLOW_INTERVAL?
>
> Does the same "delay 0 is magic, ignore the cpu argument and run right away" apply?

More specifically a 0 delay means the work is *queued* (not run) right away. The
distinction is important here since the queuing logic has a "non-reentrance"
guarantee that may change where work is queued depending on whether the work is
currently executing.

To better understand this I found the following comments and surrounding code insightful:

kernel/workqueue.c:__queue_work()
{

/* pwq which will be used unless @work is executing elsewhere */

...

/*
* If @work was previously on a different pool, it might still be
* running there, in which case the work needs to be queued on that
* pool to guarantee non-reentrancy.
* ...
*/

...
}

From what I understand __queue_work() first checks if the work is currently
running (see find_worker_executing_work()) and if it is then it does not matter
if the new work is requested to run on a different CPU - it will be queued on the
same CPU as the currently executing work.

So it looks like, if the work is *not* currently executing, a delay of 0 would
indeed queue the work to be executed as early as possible on the requested/new
CPU. This is what the snippet you quote intends.
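
To make the above concrete, here is a toy userspace model of how I read the
queueing decision. It is only a sketch: struct toy_work and toy_pick_cpu() are
made-up names for illustration and none of this is the actual workqueue code.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel API. */
struct toy_work {
	int last_cpu;	/* CPU whose pool last handled this work, -1 if none */
	bool running;	/* true while the callback executes on last_cpu */
};

/* Return the CPU whose pool the work would be queued on. */
static int toy_pick_cpu(const struct toy_work *w, int requested_cpu)
{
	/*
	 * If the work is still executing elsewhere, stay on that CPU so
	 * the callback can never run concurrently with itself.
	 */
	if (w->last_cpu >= 0 && w->last_cpu != requested_cpu && w->running)
		return w->last_cpu;

	/* Otherwise the caller's requested CPU is honored. */
	return requested_cpu;
}
```

In this model a request for CPU 2 lands on CPU 1 only while the work is still
running on CPU 1. Once it is no longer running the request for CPU 2 is honored.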

In above snippet mbm_setup_overflow_handler() is called with a 0 delay only if
cancel_delayed_work() returns "true". Per cancel_delayed_work() function comments:
/*
* ...
* Note:
* The work callback function may still be running on return, unless
* it returns %true and the work doesn't re-arm itself.
* ...
*/

From above I understand that the work is *not* currently running and that the
other planned change (the if (!is_percpu_thread()) check added to the worker) will
prevent the work from re-arming itself.
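
The note quoted above can be illustrated with another toy userspace model. This
is a sketch under my own assumptions: struct toy_dwork and the toy_*() helpers
are hypothetical names, and the only behavior modeled is that "pending" is
cleared before the callback starts and that cancel reports whether pending work
was canceled.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel API. */
struct toy_dwork {
	bool pending;	/* timer armed or work sitting on a queue */
	bool running;	/* callback currently executing */
};

/* Returns true only if pending work was canceled. */
static bool toy_cancel(struct toy_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* A worker picks up the work: pending is cleared before the callback runs. */
static void toy_start(struct toy_dwork *w)
{
	w->pending = false;
	w->running = true;
}

/* The callback re-arms itself while it is still running. */
static void toy_rearm(struct toy_dwork *w)
{
	w->pending = true;
}
```

In this model a "true" return from toy_cancel() can coincide with a running
callback only if toy_rearm() was called during the run. Without re-arming,
"true" implies that the callback is not running.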

It thus looks to me as though calling mbm_setup_overflow_handler() with 0 delay is
ok here and will indeed result in work being queued onto new CPU's queue.
What do you think?

With this reasoning there may be a current issue since mbm_setup_overflow_handler()
is currently called with 0 delay irrespective of whether the work is currently
executing? Fortunately the work always re-schedules itself instead of staying put.

Reinette

> Link: https://lore.kernel.org/all/389bd92c-47ba-46af-81cb-9b669533b1fe@xxxxxxxxx/ [1]