Re: [RFC PATCH] sched/fair: dynamically scale the period of cache work
From: Chen, Yu C
Date: Tue Apr 14 2026 - 23:31:31 EST
Hi Jianyong,
On 4/13/2026 7:27 PM, Jianyong Wu wrote:
Hi Chenyu,
[ ... ]
+ WRITE_ONCE(mm->sc_stat.next_scan, (now + mm->sc_stat.scan_period));
+
I suppose the above should be try_cmpxchg()?
Even though the update is not observed by others, it may not be a big problem.
However, using try_cmpxchg may incur more overhead than WRITE_ONCE.
So, I wonder if we can tolerate such a loss of "correctness" for the sake of performance.
try_cmpxchg is not triggered very frequently (every 10 ms), so the overhead might not be
that high? try_cmpxchg "strictly" avoids two threads entering the same loop,
and it seems that at the end of task_cache_work() there is an update_avg_scale()
which involves
u64 *avg += xxx
which is not atomic, so maybe try_cmpxchg could help with that?
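For reference, the single-winner behavior try_cmpxchg() would give here can be sketched with a userspace analogue. C11 atomics stand in for the kernel primitive, the field names mirror the patch, and the helper itself is hypothetical:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Userspace analogue of the pattern under discussion: a
 * compare-exchange lets only one thread "win" the right to run the
 * scan work for a given period, where a plain WRITE_ONCE() store
 * would let several threads pass the timeout check concurrently. */

static _Atomic uint64_t next_scan;

/* Returns 1 if this caller claimed the scan slot, 0 otherwise. */
static int try_claim_scan(uint64_t now, uint64_t scan_period)
{
    uint64_t expected = atomic_load(&next_scan);

    if (now < expected)
        return 0;               /* scan not due yet */

    /* Only one thread can advance next_scan; losers observe the
     * updated value and back off (analogue of try_cmpxchg()). */
    return atomic_compare_exchange_strong(&next_scan, &expected,
                                          now + scan_period);
}
```

With two racing callers at the same "now", exactly one claim succeeds, which is the property a plain store cannot guarantee.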
That is to say, with your above change, we have already limited the scan
ratio for multi-threaded processes significantly. There appears to be no
need to perform adaptive adjustment of scan_period - the benefit of
introducing an adaptive scan_period may not offset the overhead of
frequent writes to the "global" mm->sc_stat.scan_period due to
cache-to-cache (c2c) transfers?
If we can increase the scan period, the operations inside the scan work will
not be executed frequently, so there is little overhead from writing to the global variable.
Frequent writes only occur when the preferred node is unstable and the scan work runs
frequently; if the preferred node remains stable most of the time, we can still benefit from it.
I see. BTW, why is mm->sc_stat.need_scan needed? With sc_stat.next_scan
and sc_stat.scan_period, we should be able to adjust the timeout.
Is it because you want another condition to shrink the scan_period?
Like below:
+ if (to_pref && ret == mig_forbid)
+ mm->sc_stat.need_scan = 1;
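If the need_scan flag were dropped, the "migration forbidden" signal could in principle be folded into scan_period itself, so the existing next_scan timeout check picks up the faster rescan. A hypothetical sketch; MIN_SCAN_PERIOD, the starting value, and the helper name are illustrative, not from the patch:

```c
#include <stdint.h>

/* Hypothetical alternative to a need_scan flag: shrink the period
 * directly when a migration to the preferred node was forbidden,
 * clamped to a floor so it cannot collapse to zero. */
#define MIN_SCAN_PERIOD 10      /* illustrative floor, in ms */

static uint64_t scan_period = 80;   /* illustrative starting period */

static void shrink_scan_period(void)
{
    uint64_t p = scan_period / 2;

    scan_period = (p < MIN_SCAN_PERIOD) ? MIN_SCAN_PERIOD : p;
}
```

The next timeout check against next_scan then fires sooner without any extra per-mm state.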
thanks,
Chenyu