Re: [PATCH resend] mm: compaction: optimize proactive compaction deferrals

From: Khalid Aziz
Date: Thu Jul 22 2021 - 00:01:57 EST


On 7/21/21 6:13 AM, Charan Teja Reddy wrote:
Vlastimil Babka figured out that when fragmentation score didn't go down
across the proactive compaction i.e. when no progress is made, next wake
up for proactive compaction is deferred for 1 <<
COMPACT_MAX_DEFER_SHIFT, i.e. 64 times, with each wakeup interval of
HPAGE_FRAG_CHECK_INTERVAL_MSEC(=500). In each of this wakeup, it just
decrement 'proactive_defer' counter and goes sleep i.e. it is getting
woken to just decrement a counter. The same deferral time can also
achieved by simply doing the HPAGE_FRAG_CHECK_INTERVAL_MSEC <<
COMPACT_MAX_DEFER_SHIFT thus unnecessary wakeup of kcompact thread is
avoided thus also removes the need of 'proactive_defer' thread counter.

Link: https://lore.kernel.org/linux-fsdevel/88abfdb6-2c13-b5a6-5b46-742d12d1c910@xxxxxxx/
Signed-off-by: Charan Teja Reddy <charante@xxxxxxxxxxxxxx>


Reviewed-by: Khalid Aziz <khalid.aziz@xxxxxxxxxx>


---
Changes in V1:
o Removed the 'proactive_defer' thread counter by optimizing proactive
o This is a resend as earlier it was clubbed with other changes posted
at https://lore.kernel.org/patchwork/patch/1448789/

mm/compaction.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 621508e..db00dbf 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2885,7 +2885,8 @@ static int kcompactd(void *p)
{
pg_data_t *pgdat = (pg_data_t *)p;
struct task_struct *tsk = current;
- unsigned int proactive_defer = 0;
+ long default_timeout = msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC);
+ long timeout = default_timeout;
const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
@@ -2902,23 +2903,30 @@ static int kcompactd(void *p)
trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
- kcompactd_work_requested(pgdat),
- msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
+ kcompactd_work_requested(pgdat), timeout)) {
psi_memstall_enter(&pflags);
kcompactd_do_work(pgdat);
psi_memstall_leave(&pflags);
+ /*
+ * Reset the timeout value. The defer timeout by
+ * proactive compaction can effectively lost
+ * here but that is fine as the condition of the
+ * zone changed substantionally and carrying on
+ * with the previous defer is not useful.
+ */
+ timeout = default_timeout;
continue;
}
- /* kcompactd wait timeout */
+ /*
+ * Start the proactive work with default timeout. Based
+ * on the fragmentation score, this timeout is updated.
+ */
+ timeout = default_timeout;
if (should_proactive_compact_node(pgdat)) {
unsigned int prev_score, score;
- if (proactive_defer) {
- proactive_defer--;
- continue;
- }
prev_score = fragmentation_score_node(pgdat);
proactive_compact_node(pgdat);
score = fragmentation_score_node(pgdat);
@@ -2926,8 +2934,9 @@ static int kcompactd(void *p)
* Defer proactive compaction if the fragmentation
* score did not go down i.e. no progress made.
*/
- proactive_defer = score < prev_score ?
- 0 : 1 << COMPACT_MAX_DEFER_SHIFT;
+ if (unlikely(score >= prev_score))
+ timeout =
+ default_timeout << COMPACT_MAX_DEFER_SHIFT;
}
}