Re: [PATCHv2] mm: khugepaged: make scan loops suspend aware

From: David Hildenbrand (Arm)

Date: Mon Feb 16 2026 - 04:25:04 EST


On 2/14/26 07:35, Lance Yang wrote:


On 2026/2/12 17:10, David Hildenbrand (Arm) wrote:
On 2/12/26 10:05, Sergey Senozhatsky wrote:
[..]

Interesting, so 1dfb059b9438633 and 878aee7d6b5504e fixed real
problems "khugepaged can sometimes cause suspend to fail", but
I don't see what exactly b39ca208403c8f2 fixed.  Sounds more
like an "optimization"?

Yes, a cleanup. I wonder if it caused harm.



I only have bug reports at hand, I don't have a repro.  Can the fact
that swap reads require S/W decompression (zram) add enough latency?

I guess so. 20 seconds is still a lot.



Let me check.

cat /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan
4096

Hmm, doesn't sound too high.  Let me look more.

Yeah, that's not a lot of pages to scan. It's the default (8 * HPAGE_PMD_NR).

Right. 4096 pages is not much to scan :)

This patch lets khugepaged be frozen between VMAs.

But if khugepaged is already collapsing when the freeze starts, there
are two places without freeze checks that could take quite a while:

- __collapse_huge_page_swapin() loops over 512 pages, calling
  do_swap_page() for each swap entry.

- collapse_file() loops over 512 pages, calling shmem_get_folio(). If
  pages are swapped out, shmem_swapin_folio() is called.

Each swap-in can block for I/O. With multiple pages swapped out, the
cumulative time adds up.

But 20 seconds to swap in 4096 pages (16 MiB)?

Okay, on arm64 with 64k it would be a lot more (8 * 512 MiB == 4 GiB).
With 16k we're at 8 * 32 MiB = 256 MiB.


Maybe we also need checkpoints inside these loops to bail out early?

I'd only do that if we have evidence that it's actually helpful.

@Sergey, which base page size are you running with (4k vs. 16k vs. 64k)? I assume your report is on aarch64, correct?

--
Cheers,

David