Re: [PATCHv2] mm: khugepaged: make scan loops suspend aware

From: David Hildenbrand (Arm)

Date: Thu Feb 12 2026 - 04:11:03 EST

On 2/12/26 10:05, Sergey Senozhatsky wrote:

On (26/02/12 09:44), David Hildenbrand (Arm) wrote:
[..]

If we're fixing an issue, we usually try to identify which commit introduced the
issue.

For example, support for freezing was introduced in

commit 878aee7d6b5504e01b9caffce080e792b6b8d090
Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Date: Thu Jan 13 15:47:10 2011 -0800

thp: freeze khugepaged and ksmd
It's unclear why schedule friendly kernel threads can't be taken away by
the CPU through the scheduler itself. It's safer to stop them as they can
trigger memory allocation, if kswapd also freezes itself to avoid
generating I/O they have too.

Now that I am looking through the history, I find:

commit b39ca208403c8f2c17dab1fbfef1f5ecaff25e53
Author: Kevin Hao <haokexin@xxxxxxxxx>
Date: Wed Dec 20 07:17:53 2023 +0800

mm/khugepaged: remove redundant try_to_freeze()
A freezable kernel thread can enter frozen state during freezing by either
calling try_to_freeze() or using wait_event_freezable() and its variants.
However, there is no need to use both methods simultaneously. The
freezable wait variants have been used in khugepaged_wait_work() and
khugepaged_alloc_sleep(), so remove this redundant try_to_freeze().
I used the following stress-ng command to generate some memory load on my
Intel Alder Lake board (24 CPUs, 32G memory).

I wonder if that made the issue more likely to appear?

Interestingly, we also had in the past:

commit 1dfb059b9438633b0546c5431538a47f6ed99028
Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Date: Thu Dec 8 14:33:57 2011 -0800

thp: reduce khugepaged freezing latency
khugepaged can sometimes cause suspend to fail, requiring that the user
retry the suspend operation.

So it's a recurring theme.

Interesting, so 1dfb059b9438633 and 878aee7d6b5504e fixed real
problems "khugepaged can sometimes cause suspend to fail", but
I don't see what exactly b39ca208403c8f2 fixed. Sounds more
like an "optimization"?

Yes, a cleanup. I wonder if it caused harm.

Given that we only scan "khugepaged_pages_to_scan" pages/ptes/etc. before going back to sleep,
I wonder how that can take so long in your setup.

Why does it end up taking something around 20 seconds in your setup?

I only have bug reports at hand; I don't have a repro. Can the fact
that swap reads require S/W decompression (zram) add enough latency?

I guess so. 20 seconds is still a lot.

How is khugepaged_pages_to_scan set in your environment?

Let me check.

cat /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan
4096

Hmm, doesn't sound too high. Let me look more.

Yeah, that's not a lot of pages to scan. It's the default (8 * HPAGE_PMD_NR).
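
For reference, a minimal userspace sketch of where the 4096 comes from,
assuming 4 KiB base pages and 2 MiB PMD-sized THPs (PAGE_SHIFT = 12,
PMD_SHIFT = 21, as on x86-64). The helper names are made up for
illustration and are not kernel functions; other page-size configurations
would give a different default.

```c
/*
 * Userspace sketch, not kernel code. Both helpers are hypothetical and
 * only mirror the arithmetic behind the default pages_to_scan value.
 */

/* Number of base pages in one PMD-sized huge page: 1 << (PMD_SHIFT - PAGE_SHIFT). */
unsigned long hpage_pmd_nr(unsigned int page_shift, unsigned int pmd_shift)
{
	return 1UL << (pmd_shift - page_shift);
}

/* Default scan budget per khugepaged wakeup: 8 * HPAGE_PMD_NR. */
unsigned long default_pages_to_scan(unsigned int page_shift, unsigned int pmd_shift)
{
	return 8 * hpage_pmd_nr(page_shift, pmd_shift);
}
```

With page_shift = 12 and pmd_shift = 21, hpage_pmd_nr() gives 512 and
default_pages_to_scan() gives 4096, which matches the sysfs value above.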

--
Cheers,

David