Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled

From: David Hildenbrand
Date: Wed Jun 25 2025 - 03:36:04 EST



We would all prefer a less messy world of THP tunables. I certainly
find plenty to dislike there too; and wish that a less assertive name
than "never" had been chosen originally for the default off position.

But please don't break the accepted and documented behaviour of
MADV_COLLAPSE now.

Again see above, I absolutely disagree this is documented _clearly_. And
that's the underlying issue here.
> > I feel like if you polled 100 system administrators (assuming they knew
about THP) as to how you globally disable THP, probably all 100 would say
you do it via:

# echo never > /sys/kernel/mm/transparent_hugepage/enabled


Yes. One big problem is that the documentation was not updated.

Changing the meaning of "entirely disabled" to "entirely disabled automatically (page faults, khugepaged)"

So shouldn't 'never break userspace' be based on practical reality rather
than a theorised interpretation of documents that sadly are not clear
enough?

I think the problem is that there might indeed be more users out there relying on "never+MADV_COLLPASE" to now place THPs than "never+MADV_COLLPASE" to no place THPs.

What is the harm when not placing THPs? Performance degradation for some apps?

What is the harm when placing THPs although disabled? Making the life for David for debugging customer issues harder :P



If you want to exclude all possibility of THPs, then please use the
prctl(PR_SET_THP_DISABLE); or shmem_enabled=deny (I think it was me
who insisted that be respected by MADV_COLLAPSE back then).

While it's useful to have this, prctl() is where APIs go to die. It's a
hidden wasteland that nobody knows about, it may as well not exist.

We have a whole sysctl directory for configuring this stuff. It's sort of
crazy to have that then to have a special prctl() hidden away also...

prctl(PR_SET_THP_DISABLE) is today not really a reasonable way for an admin to disable THPs system wide I'm afraid.



Add a "deny" option to /sys/kernel/mm/transparent_hugepage/enabled
if you like. (But in these days of filesystem large folios, adding
new protections against them seems a few years late.)

Based on a reasonable interpretation of 'never' I would say we retain this
series as-is, and 'deny' could be what 'never' was intended to be before.

At least for shmem_enabled never means "unless overwritten per mount through huge=" and "deny" means "force off for all mounts". So "deny" is "harder"

Inverting the semantics here might be causing even more problems.

--
Cheers,

David / dhildenb