Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled

From: Yang Shi
Date: Fri May 02 2025 - 16:53:34 EST




On 5/2/25 5:46 AM, Lorenzo Stoakes wrote:
+cc Andrew.

Ignacio, you should always include Andrew in patch submissions to mm :)

+cc Yang Shi who added this in the first place in commit c4608d1bf7c6 ("mm:
mmap: map MAP_STACK to VM_NOHUGEPAGE").

Thanks for cc'ing me.


On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote:
From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@xxxxxxxx>

commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps
the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if
CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the
VM_NOHUGEPAGE does not make sense. For instance, when calling madvise()
with MADV_NOHUGEPAGE, an error is always returned.

Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@xxxxxxxx>
I don't see how can this cause a problem, and it fixes one in practice, so
LGTM. Though see note below about CRIU :)

I also added a nit below, if you address this you can re-use my tag.

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>

Yeah, I agree. Looks good to me too. Reviewed-by: Yang Shi <yang@xxxxxxxxxxxxxxxxxxxxxx>


Thanks!

Do we want to back port this to stable kernels? If so we should have a:

Fixes: c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE")
cc: stable@xxxxxxxxxxxxxxx

Appended here, and Greg's scripts should automagically backport, assuming
no conflicts or such (I don't _think_ there would be...)

---
I discovered this issue when trying to use the tool CRIU to checkpoint
and restore a container. Our running kernel is compiled without
CONFIG_TRANSPARENT_HUGETABLES. CRIU parses the output of
/proc/<pid>/smaps and saves the "nh" flag. When trying to restore the
container, CRIU fails to restore the "nh" mappings, since madvise()
MADV_NOHUGEPAGE always returns an error because
CONFIG_TRANSPARENT_HUGETABLES is not defined.
Yeah this is really not a stable or valid use of the /proc/$pid/[s]maps
interface :P CRIU is sort of a blurry line of relying on internal
implementation details so we're kinda not obligated to prevent breakages.

CRIU is kinda relying on internal implementation details so debatable as to
whether we should be bending over backwards to support.

BUT, we also don't want to cause unwanted issues if there's a simple fix
and this seems reasonable to me.

---
include/linux/mman.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index bce214fece16b9af3791a2baaecd6063d0481938..1e83bc0e3db670b04743f5208826e87455a05325 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -155,7 +155,9 @@ calc_vm_flag_bits(struct file *file, unsigned long flags)
return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
_calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
_calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE)
NIT, but can we use ifdef here for consistency? Thanks.

_calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
+#endif
arch_calc_vm_flag_bits(file, flags);
}


---
base-commit: fc96b232f8e7c0a6c282f47726b2ff6a5fb341d2
change-id: 20250428-map-map_stack-to-vm_nohugepage-only-if-thp-is-enabled-ce40a1de095d

Best regards,
--
Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@xxxxxxxx>