Re: [PATCHv2] mm: khugepaged: make scan loops suspend aware

From: Sergey Senozhatsky

Date: Thu Feb 12 2026 - 04:06:25 EST


On (26/02/12 09:44), David Hildenbrand (Arm) wrote:
[..]
> If we're fixing an issue, we usually try to identify which commit introduced the
> issue.
>
> For example, support for freezing was introduced in
>
> commit 878aee7d6b5504e01b9caffce080e792b6b8d090
> Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Date: Thu Jan 13 15:47:10 2011 -0800
>
> thp: freeze khugepaged and ksmd
> It's unclear why schedule friendly kernel threads can't be taken away by
> the CPU through the scheduler itself. It's safer to stop them as they can
> trigger memory allocation, if kswapd also freezes itself to avoid
> generating I/O they have too.
>
>
>
> Now that I am looking through the history, I find:
>
> commit b39ca208403c8f2c17dab1fbfef1f5ecaff25e53
> Author: Kevin Hao <haokexin@xxxxxxxxx>
> Date: Wed Dec 20 07:17:53 2023 +0800
>
> mm/khugepaged: remove redundant try_to_freeze()
> A freezable kernel thread can enter frozen state during freezing by either
> calling try_to_freeze() or using wait_event_freezable() and its variants.
> However, there is no need to use both methods simultaneously. The
> freezable wait variants have been used in khugepaged_wait_work() and
> khugepaged_alloc_sleep(), so remove this redundant try_to_freeze().
> I used the following stress-ng command to generate some memory load on my
> Intel Alder Lake board (24 CPUs, 32G memory).
>
>
> I wonder if that made the issue more likely to appear?
>
>
> Interestingly, we also had in the past:
>
> commit 1dfb059b9438633b0546c5431538a47f6ed99028
> Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Date: Thu Dec 8 14:33:57 2011 -0800
>
> thp: reduce khugepaged freezing latency
> khugepaged can sometimes cause suspend to fail, requiring that the user
> retry the suspend operation.
>
>
> So it's a recurring theme.

Interesting, so 1dfb059b9438633 and 878aee7d6b5504e fixed real
problems (e.g. "khugepaged can sometimes cause suspend to fail"),
but I don't see what exactly b39ca208403c8f2 fixed. It sounds more
like an "optimization"?

> Given that we only scan "khugepaged_pages_to_scan" pages/ptes/etc. before going back to sleep,
> I wonder how that can take in your setup that long.
>
> Why does it end up taking something around 20 seconds in your setup?

I only have bug reports on hand, I don't have a repro. Could the fact
that swap reads require S/W decompression (zram) add enough latency?

> How is khugepaged_pages_to_scan set in your environment?

Let me check.

cat /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan
4096

Hmm, that doesn't sound too high. Let me dig some more.
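
For reference, a minimal sketch of how the rest of the khugepaged knobs
could be dumped next to pages_to_scan, together with a back-of-the-envelope
per-page budget. Assumptions, loudly: the "around 20 seconds" mentioned
above is attributed to a single scan pass of pages_to_scan pages, which
may well not be what actually happens; read_knobs() and
per_page_budget_ms() are names made up for illustration, not anything
from the kernel tree.

```python
#!/usr/bin/env python3
"""Dump khugepaged sysfs tunables and estimate a per-page time budget."""
import os

# Same directory as the pages_to_scan file read above.
KHUGEPAGED_DIR = "/sys/kernel/mm/transparent_hugepage/khugepaged"


def read_knobs(base=KHUGEPAGED_DIR):
    """Return {name: int} for every integer-valued file under base.

    Unreadable or non-integer files are skipped; a missing directory
    (e.g. THP not built in) yields an empty dict.
    """
    knobs = {}
    try:
        names = sorted(os.listdir(base))
    except OSError:
        return knobs
    for name in names:
        path = os.path.join(base, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                knobs[name] = int(f.read().strip())
        except (OSError, ValueError):
            continue
    return knobs


def per_page_budget_ms(stall_seconds, pages_to_scan):
    """Average time per scanned page, in milliseconds.

    ASSUMPTION: the whole stall is spent in one pass over pages_to_scan
    pages. This is a hypothetical simplification for sizing only.
    """
    return stall_seconds * 1000.0 / pages_to_scan


if __name__ == "__main__":
    knobs = read_knobs()
    for name, value in knobs.items():
        print(f"{name} = {value}")
    pages = knobs.get("pages_to_scan")
    if pages:
        # 20 is the rough stall figure quoted earlier in this thread.
        print(f"per-page budget for a 20s stall: "
              f"{per_page_budget_ms(20, pages):.2f} ms")
```

With pages_to_scan at 4096, a 20 second stall would average out to a
bit under 5 ms per page under that assumption, so per-page zram
decompression alone would need to be rather slow to explain it.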