Re: madvise(MADV_COLLAPSE) fails with EINVAL on dirty file-backed text pages

From: David Hildenbrand (Red Hat)

Date: Fri Nov 07 2025 - 04:12:06 EST




5. Yes, I'm calling madvise(MADV_COLLAPSE) on the text portion of the executable, using the address
range obtained from /proc/self/maps. IIUC, this should benefit applications by reducing ITLB pressure.
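
For readers following along, the user-space side described here might look roughly like the sketch below. It is not the reporter's actual code: find_text_range() and collapse_text() are hypothetical helper names, and treating the first "r-xp" mapping in the maps file as the executable's text is an assumption that happens to hold for typical layouts.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* Linux 6.1+; fallback for older libc headers */
#endif

/*
 * Find the first private, executable ("r-xp") mapping listed in maps_path.
 * Assumption: for the main binary this is its text segment.
 * Returns 0 and fills *start / *end on success, -1 otherwise.
 */
int find_text_range(const char *maps_path,
		    unsigned long *start, unsigned long *end)
{
	char line[4096], perms[5];
	unsigned long s, e;
	FILE *f = fopen(maps_path, "r");
	int ret = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		/* Each line starts with "start-end perms ..." in hex. */
		if (sscanf(line, "%lx-%lx %4s", &s, &e, perms) == 3 &&
		    strcmp(perms, "r-xp") == 0) {
			*start = s;
			*end = e;
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}

/* Ask the kernel to collapse [start, end); returns 0, or -1 with errno set. */
int collapse_text(unsigned long start, unsigned long end)
{
	return madvise((void *)start, end - start, MADV_COLLAPSE);
}
```

A caller would run find_text_range("/proc/self/maps", &s, &e), then collapse_text(s, e), and inspect errno on failure; the EINVAL discussed in this thread would show up at that second step.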

I agree with the suggestions to either return EAGAIN instead of EINVAL or, at minimum, document the
EINVAL return for dirty pages. I'm happy to work on a patch.

Of course, we could detect that we are in MADV_COLLAPSE and simply trigger writeback ourselves. After all,
user space asked for a collapse, and it's not khugepaged, which would simply revisit it later.

I did something similar in

commit ab73b29efd36f8916c6cc9954e912c4723c9a1b0
Author: David Hildenbrand <david@xxxxxxxxxx>
Date: Fri May 16 14:39:46 2025 +0200

s390/uv: Improve splitting of large folios that cannot be split while dirty

Currently, starting a PV VM on an iomap-based filesystem with large
folio support, such as XFS, will not work. We'll be stuck in
unpack_one()->gmap_make_secure(), because we can't seem to make progress
splitting the large folio.

Where I effectively use filemap_write_and_wait_range().

It could possibly be used early on to write back the whole range to be collapsed, once.
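
Until the kernel does something like that, a caller could approximate the idea from user space. The following is only a sketch under assumptions: collapse_with_flush() is a hypothetical helper, and it assumes that flushing the backing file with fsync() cleans the dirty pages so that a second MADV_COLLAPSE attempt can succeed, which may not hold for every filesystem or kernel version.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* Linux 6.1+; fallback for older libc headers */
#endif

/*
 * Try MADV_COLLAPSE; if it fails with EINVAL (current behaviour for dirty
 * pages) or EAGAIN (the proposed alternative), flush the backing file at
 * 'path' and retry once. Returns 0 on success, -1 with errno set otherwise.
 */
int collapse_with_flush(const char *path, void *addr, size_t len)
{
	int fd, rc, saved;

	if (madvise(addr, len, MADV_COLLAPSE) == 0)
		return 0;
	if (errno != EINVAL && errno != EAGAIN)
		return -1;

	/* Write back dirty page cache for the file (assumed to help). */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	rc = fsync(fd);
	saved = errno;
	close(fd);
	if (rc != 0) {
		errno = saved;
		return -1;
	}

	/* Single retry; the caller decides whether to try again later. */
	return madvise(addr, len, MADV_COLLAPSE);
}
```

Note that EINVAL has other causes as well (for example an unaligned address or a kernel without MADV_COLLAPSE), so the retry is a best-effort step rather than a reliable test for dirty pages.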