On Fri, Jul 07, 2023 at 01:29:02PM +0200, David Hildenbrand wrote:
On 07.07.23 11:52, Ryan Roberts wrote:
On 07/07/2023 09:01, Huang, Ying wrote:
Although we can use a smaller page order for FLEXIBLE_THP, it's hard to
avoid internal fragmentation completely. So, I think that eventually we
will need to provide a mechanism for the users to opt out, e.g.,
something like "always madvise never" via
/sys/kernel/mm/transparent_hugepage/enabled. I'm not sure whether it's
a good idea to reuse the existing interface of THP.
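
For reference, the existing toggle mentioned above lists all modes on one line and marks the active one with square brackets (for example "always [madvise] never"). A minimal sketch of reading it from userspace, assuming a Linux system built with THP support; the helper names parse_thp_mode and read_thp_mode are illustrative and not part of any kernel interface:

```python
from pathlib import Path

# Path quoted in the thread; it only exists on Linux kernels with THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode, i.e. the token wrapped in square brackets."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def read_thp_mode(path: Path = THP_ENABLED) -> str:
    """Read the sysfs file and return the active mode."""
    return parse_thp_mode(path.read_text())
```

Keeping the parsing separate from the file read means the format handling can be checked on any machine, not only on one that exposes the sysfs file.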
I wouldn't want to tie this to the existing interface, simply because that
implies that we would want to follow the "always" and "madvise" advice too. That
means that on a thp=madvise system (which is certainly the case for Android and
other client systems) we would have to disable large anon folios for VMAs that
haven't explicitly opted in. That breaks the intention that this should be an
invisible performance boost. I think it's important to set the policy for use of
It will never ever be a completely invisible performance boost, just like
ordinary THP.
Using the exact same existing toggle is the right thing to do. If someone
specifies "never" or "madvise", then do exactly that.
It might make sense to have more modes or additional toggles, but
"madvise=never" means no memory waste.
I hate the existing mechanisms. They are an abdication of our
responsibility, and an attempt to blame the user (be it the sysadmin
or the programmer) of our code for using it wrongly. We should not
replicate this mistake.
Our code should be auto-tuning. I posted a long, detailed outline here:
https://lore.kernel.org/linux-mm/Y%2FU8bQd15aUO97vS@xxxxxxxxxxxxxxxxxxxx/
I remember I raised it already in the past, but you *absolutely* have to
respect the MADV_NOHUGEPAGE flag. There is user space out there (for
example, userfaultfd) that doesn't want the kernel to populate any
additional page tables. So if you have to respect that already, then also
respect MADV_HUGEPAGE, simple.
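
To make the position above concrete, here is a toy model of "use the existing toggle and respect both madvise flags". It is only a sketch of the policy being argued for, not kernel code; the function large_folio_allowed and its parameters are hypothetical names chosen for illustration.

```python
def large_folio_allowed(system_mode: str,
                        vma_hugepage: bool = False,
                        vma_nohugepage: bool = False) -> bool:
    """Toy policy: may a VMA be backed by large anon folios?

    system_mode    -- "always", "madvise" or "never" (the existing toggle)
    vma_hugepage   -- the VMA was marked with MADV_HUGEPAGE
    vma_nohugepage -- the VMA was marked with MADV_NOHUGEPAGE
    """
    if system_mode not in ("always", "madvise", "never"):
        raise ValueError(f"unknown mode: {system_mode!r}")
    # An explicit per-VMA opt-out always wins (e.g. userfaultfd users).
    if vma_nohugepage:
        return False
    if system_mode == "never":
        return False
    if system_mode == "always":
        return True
    # "madvise": only VMAs that explicitly opted in qualify.
    return vma_hugepage
```

Under this model, a thp=madvise system returns False for every VMA that has not opted in, which is the cost objected to earlier in the thread.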
Possibly having uffd enabled on a VMA should disable using large folios,
I can get behind that. But the notion that userspace knows what it's
doing ... hahaha. Just ignore the madvise flags. Userspace doesn't
know what it's doing.