RE: [PATCH] fs/resctrl: Fix use-after-free in resctrl_offline_mon_domain()

From: Luck, Tony

Date: Wed May 06 2026 - 16:53:15 EST


> > Question?
> >
> >> +	if (!is_percpu_thread()) {
> >> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
> >> +			if (d->mbm_work_cpu == nr_cpu_ids)
> >> +				mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL, RESCTRL_PICK_ANY_CPU);
> >
> > Should that "MBM_OVERFLOW_INTERVAL" be "0"? This worker is presumably
> > already slightly late because of the offline CPU overhead and time to
> > be picked up by another CPU. Maybe it should run right away on whatever
> > new CPU in the domain is picked?
>
> The delay is intentionally _not_ zero and there should probably be a comment
> to make that clear. My module experiment demonstrated that when the work associated
> with the work_struct is already running then no matter which CPU is provided as parameter
> to schedule_delayed_work_on() the workqueue handling will schedule the work on the same
> CPU as the currently executing work. Second time around is_percpu_thread() will still be
> false but this time mbm_work_cpu will be set to CPU it should have been scheduled to and
> work will exit without re-arming the worker and the associated domain loses its worker.
>
> By setting the delay to MBM_OVERFLOW_INTERVAL it guarantees that the current executing
> worker will be done by the time the newly scheduled worker should run and thus
> be scheduled on correct CPU. I assume you are hinting that if the memory bandwidth is
> under pressure there may thus be a risk that an overflow occurred? Perhaps
> MBM_OVERFLOW_INTERVAL is too big - the delay only needs to be big enough to ensure that
> current worker is done before new worker is scheduled. Do you have suggestions?

I'd missed that a delay of "0" effectively means "ignore the requested CPU and
run right away on the CPU where the work is still executing". I'll keep it at
MBM_OVERFLOW_INTERVAL (and use CQM_LIMBOCHECK_INTERVAL for the cqm_limbo case).

While running late is not ideal, MBM_OVERFLOW_INTERVAL was chosen conservatively
for older Intel systems with 24-bit MBM counters. A gap of twice that is unlikely
to cause issues on those systems. Modern (Ice Lake and newer) Intel systems have
32-bit counters and are completely safe. I believe that AMD counters are also
wide enough that a 2-second interval is safe. Babu?
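
For a rough feel of the margin, here is a back-of-the-envelope sketch in plain
userspace C (not kernel code, and the helper names are made up for this mail).
It assumes the usual rule that a delta computed modulo the counter width stays
correct as long as the counter wraps at most once between two reads. The
bytes-per-count scaling factor and the bandwidth are hypothetical example
inputs, not values taken from any particular CPU:

```c
#include <stdint.h>

/*
 * Seconds until a free-running counter of width_bits bits wraps, when
 * each count stands for bytes_per_count bytes and traffic flows at
 * bytes_per_sec. width_bits must be below 64. All inputs are
 * illustrative assumptions, not hardware constants.
 */
static double mbm_secs_to_wrap(unsigned int width_bits,
			       uint64_t bytes_per_count,
			       uint64_t bytes_per_sec)
{
	double counts = (double)(1ULL << width_bits);

	return counts * (double)bytes_per_count / (double)bytes_per_sec;
}

/*
 * Non-zero if two reads gap_secs apart cannot be separated by a full
 * wrap of the counter, so the modulo delta is still unambiguous.
 */
static int mbm_gap_is_safe(unsigned int width_bits,
			   uint64_t bytes_per_count,
			   uint64_t bytes_per_sec,
			   double gap_secs)
{
	return mbm_secs_to_wrap(width_bits, bytes_per_count,
				bytes_per_sec) > gap_secs;
}
```

With made-up inputs of 64 KiB per count and 64 GiB/s of traffic, a 24-bit
counter would wrap after 16 seconds and a 32-bit counter after 4096 seconds,
so a 2-second gap passes the check in both cases. Real margins depend on the
actual scaling factor and the achievable bandwidth of the system.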

-Tony