Re: [RFC PATCH] sched/fair: dynamically scale the period of cache work
From: Chen, Yu C
Date: Mon Apr 13 2026 - 04:41:05 EST
Hi Jianyong,
On 4/13/2026 3:23 PM, Jianyong Wu wrote:
> When a preferred LLC is selected and remains stable, task_cache_work does
> not need to run frequently. Because it scans all system CPUs for
> computation, high-frequency execution hurts performance. We thus reduce
> the scan rate in such cases.
> On the other hand, if the preferred node becomes suboptimal, we should
> increase the scan frequency to quickly find a better placement. The scan
> period is therefore dynamically adjusted.
> Signed-off-by: Jianyong Wu <wujianyong@xxxxxxxx>
> ---
> Hi ChenYu, Tim, Gengkun,
> I have another approach to address this issue, based on the observation
> that the scan work can be canceled if the preferred node is stable. This
> patch merely demonstrates the idea, but still needs more testing to
> verify its functionality. I'm sending it out early to gather feedback and
> opinions.
Thanks for providing this patch.
> if (work->next == work) {
> @@ -1728,7 +1734,7 @@ static void task_cache_work(struct callback_head *work)
> struct task_struct *p = current, *cur;
> unsigned long curr_m_a_occ = 0;
> struct mm_struct *mm = p->mm;
> - unsigned long m_a_occ = 0;
> + unsigned long m_a_occ = 0, need_scan = 0, now;
> cpumask_var_t cpus;
> u64 t0, scan_cost;
> @@ -1753,6 +1759,12 @@ static void task_cache_work(struct callback_head *work)
> t0 = sched_clock_cpu(curr_cpu);
> + now = jiffies;
> + if (time_before(now, READ_ONCE(mm->sc_stat.next_scan)))
> + return;
> +
I agree that limiting the scan rate would be useful. Your change above
is similar to what NUMA balancing does in task_numa_work(): it allows
only one thread within the same process to perform the statistics
calculation per scan period, which avoids redundant computation.
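For reference, a minimal user-space sketch of the wraparound-safe
comparison that time_before() relies on, and of the "is a scan due"
gate built on it. The sketch_* names are made up for illustration and
are not part of the patch or the kernel; only the signed-difference
trick mirrors the kernel macro.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * User-space analogue of the kernel's time_before(a, b): true when
 * tick count a is earlier than b, even if the counter has wrapped.
 * The unsigned subtraction wraps modulo 2^N; the cast to long then
 * yields the sign of the distance between the two values.
 */
static bool sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Hypothetical gate: a scan is due once "now" has reached next_scan.
 * This is the inverse of the early-return condition in the hunk above.
 */
static bool sketch_scan_due(unsigned long now, unsigned long next_scan)
{
	return !sketch_time_before(now, next_scan);
}
```

A plain "now < next_scan" comparison gives the wrong answer when
next_scan has wrapped past zero while now has not, which is why the
signed difference is used.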
> + WRITE_ONCE(mm->sc_stat.next_scan, (now + mm->sc_stat.scan_period));
> +
I suppose the above should be try_cmpxchg()?
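To make the try_cmpxchg() point concrete, here is a rough user-space
sketch using C11 atomics in place of the kernel primitive. The struct
and function names (sketch_scan_stat, sketch_try_claim_scan) are
hypothetical and only model the next_scan/scan_period fields from the
patch; this is not the proposed kernel code. The idea is that a thread
advances next_scan only if nobody else has changed it since the thread
read it, so at most one thread per period wins and does the scan.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm scan state in the patch. */
struct sketch_scan_stat {
	_Atomic unsigned long next_scan;	/* earliest tick for the next scan */
	unsigned long scan_period;		/* ticks between scans */
};

/*
 * Return true if the caller won the right to scan at tick "now".
 * Losers, and callers that arrive too early, return false.
 */
static bool sketch_try_claim_scan(struct sketch_scan_stat *st,
				  unsigned long now)
{
	unsigned long expected =
		atomic_load_explicit(&st->next_scan, memory_order_relaxed);

	/* Same wraparound-safe check as time_before(now, next_scan). */
	if ((long)(now - expected) < 0)
		return false;

	/*
	 * Succeeds only if next_scan still holds the value read above.
	 * A concurrent winner has already moved it, so this thread backs off.
	 */
	return atomic_compare_exchange_strong(&st->next_scan, &expected,
					      now + st->scan_period);
}
```

With a plain WRITE_ONCE(), two threads that both pass the time_before()
check in the same window would both go on to scan; with the
compare-and-exchange, the second one sees a changed next_scan and
returns early.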
That is to say, with your change above, we have already limited the scan
rate for multi-threaded processes significantly. There appears to be no
need to adjust scan_period adaptively: the benefit of an adaptive
scan_period may not offset the overhead of frequent writes to the
"global" mm->sc_stat.scan_period, given the cache-to-cache (c2c) traffic
they cause?
thanks,
Chenyu