Re: [stable-6.6.y] mm: khugepaged refuses to freeze

From: Sergey Senozhatsky

Date: Thu Feb 05 2026 - 23:35:35 EST


On (26/02/06 12:38), Sergey Senozhatsky wrote:
[..]
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index eff9e3061925..fa6a018b20a8 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -1894,6 +1894,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
> > >  		xas_set(&xas, index);
> > >  		folio = xas_load(&xas);
> > > +		if (try_to_freeze())
> > > +			goto xa_unlocked;
> > > +
> > >  		VM_BUG_ON(index != xas.xa_index);
> > >  		if (is_shmem) {
> > >  			if (!folio) {
> >
> > Your analysis is reasonable. When the system is freezing, khugepaged is
> > still trying to swap in shmem to collapse, which prevents the system from
> > entering the suspend state. However, it’s not only shmem that will swap in;
> > collapsing anonymous folios may also trigger swap-in operations.
>
> Right, I thought about it but wasn't sure. Could the inner loop (e.g.
> collapse_file() in this particular case) loop long enough to fail suspend
> w/o ever giving the outer loop (khugepaged_do_scan()) a chance to freeze?

For inner loops, I wondered whether cond_resched() could serve as an
indicator of where try_to_freeze() should be placed. Those cond_resched()
calls are there for a reason, after all: they mark loops that can run for
a long time. E.g. something like:

---

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fa6a018b20a8..cee08466a069 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2431,6 +2431,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
 		unsigned long hstart, hend;

 		cond_resched();
+		if (try_to_freeze())
+			break;
+
 		if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
 			progress++;
 			break;
@@ -2453,6 +2456,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
 			bool mmap_locked = true;

 			cond_resched();
+			if (try_to_freeze())
+				goto breakouterloop;
+
 			if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
 				goto breakouterloop;
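
---

To make the control flow easier to see outside the kernel, here is a rough
userspace sketch of the same idea: every reschedule point in a nested scan
also checks for a freeze, and a freeze unwinds both loops. Everything in it
is hypothetical and only for illustration: cond_resched_stub() and
try_to_freeze_stub() are stand-ins, not the kernel functions, and
resched_or_freeze() and scan() are made-up names. The only behaviour
borrowed from the kernel is that try_to_freeze() returns true when the
caller was actually frozen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real freezer API. */
static bool freeze_requested;
static int freeze_count;

/* Stand-in for cond_resched(); a no-op in this sketch. */
static void cond_resched_stub(void)
{
}

/* Stand-in for try_to_freeze(): returns true if the caller was frozen. */
static bool try_to_freeze_stub(void)
{
	if (!freeze_requested)
		return false;
	freeze_requested = false;	/* pretend we were thawed right away */
	freeze_count++;
	return true;
}

/* Hypothetical helper: pair every reschedule point with a freeze check. */
static bool resched_or_freeze(void)
{
	cond_resched_stub();
	return try_to_freeze_stub();
}

/*
 * Scan nr_slots * per_slot items. A freeze request is raised once
 * `scanned` reaches freeze_at (pass -1 for "never"). Returns the number
 * of items scanned before the scan finished or was cut short.
 */
static int scan(int nr_slots, int per_slot, int freeze_at)
{
	int scanned = 0;

	assert(nr_slots >= 0 && per_slot >= 0);

	for (int s = 0; s < nr_slots; s++) {
		for (int i = 0; i < per_slot; i++) {
			if (scanned == freeze_at)
				freeze_requested = true;
			/* Inner-loop check: bail out of both loops. */
			if (resched_or_freeze())
				goto breakouterloop;
			scanned++;
		}
	}
breakouterloop:
	return scanned;
}
```

With 4 slots of 8 items and a freeze raised at item 10, scan(4, 8, 10)
stops after 10 items instead of finishing all 32, which is the behaviour
the inner-loop checks in the diff above are after.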