Re: [PATCH 1/2] mm/khugepaged: fix spurious -EINVAL from sub-PMD MADV_COLLAPSE range
From: Wandun
Date: Sat May 09 2026 - 03:04:44 EST
On 5/8/26 23:02, Lorenzo Stoakes wrote:
On Fri, May 08, 2026 at 02:27:37PM +0200, David Hildenbrand (Arm) wrote:Yes, PMD-aligned address.
On 5/7/26 09:05, Chen Wandun wrote:I think both should return -EINVAL.
madvise_collapse() computes the THP-aligned window:
hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK /* round up */
hend = end & HPAGE_PMD_MASK /* round down */
Previously this was done after kmalloc_obj(), so problem arose when
the range contained no complete PMD-aligned window (hstart >= hend).
When hstart > hend, (hend - hstart) wraps unsigned to a huge value, the
final comparison fails and -EINVAL is returned instead of 0. Consider
What's aligned? You're putting a random variable name in there? Presumably a PMD-aligned address?two single-page calls on a 2 MiB-aligned address:
/* hstart == hend == aligned -> 0 == 0 -> returns 0 */
madvise(aligned, PAGE_SIZE, MADV_COLLAPSE);
Yes, I also thinks this patch only fixes minor issue or cosidered a clean-up.
Disagree./* hstart = aligned + 2MiB, hend = aligned
* (hend - hstart) wraps unsigned -> returns -EINVAL */
madvise(aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);
Both calls cover less than one THP and collapse nothing; both should
return 0.
Okay, so we talk about a "userspace is being stupid" scenario.Yes!
I feel that -EINVAL is correct for hend > hstart, and I think it might even be a
userland A[BP]I break to change it (maybe somebody, somewhere is being foolish
enough to use this to also validate input ranges).
The weirdness is when hstart == hend being 0 but that's sort of established
behaviour I guess.
Yes, in general - so what? The user is doing stupid things, so the user winsIn addition, kmalloc_obj(), mmgrab() and lru_add_drain_all() were allJust a comment as you motivate here why this is suboptimal: we do not care about
called before discovering there was nothing to do, only for the code
to kfree() and return immediately after.
a "userspace is being stupid" scenario being fast.
stupid prizes?
I'm not sure I want a fixes here, this isn't really fixing anything. This isn'tFix both by computing hstart/hend after thp_vma_allowable_order() butFixes: is likely ok, but I don't think we want to treat this as a hotfix or CC
before kmalloc_obj(), and returning 0 early when hstart >= hend.
Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
stable.
a bug afaik, it's just us not handling this brilliantly, but (possibly by
mistake) getting the right output.
I would drop this Fixes tag in v2 to avoid any confusion.
I can confirm this patch is my own work,I found the issue and wrote this patch myself.
I put this patch through AI detection and it's telling me there's an 80% chanceSigned-off-by: Chen Wandun <chenwandun@xxxxxxxxxxx>
this whole thing is LLM-generated, which is making me grumpy.
Can you confirm that this is, in fact, your own work? Plagiarism is not a nice
thing to do, and THP doesn't need more traffic, we're overloaded as it is.
The issue was found when I noticed THP pages were still being generated even after
adding "transparent_hugepage=never" to the cmdline, after debugging, and finally found
this was due to madvise + collapse path, while reviewing code I found this minor
issue and wrote this patch.
I did use an LLM, but only to check the commit message to find spelling/grammar
errors and improve readability.
I fully understand your concern about the traffic, I will be more careful about
what I send to the list.
Apologies for not basing this on mm-unstable, I'll fix in v2.
See below re: conflict.---
mm/khugepaged.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b8452dbdb043..92473d93e837 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2836,6 +2836,12 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_FORCED_COLLAPSE, PMD_ORDER))
return -EINVAL;
+ hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
+ hend = end & HPAGE_PMD_MASK;
if (hstart > hend)+
+ if (hstart >= hend)
+ return 0;
return -EINVAL;
/* For compatibility, users may rely on this. */
if (hstart == hend)
return 0;
Is probably better.
But I'm not sure what the point is if we're already doing this behaviour?
Please use mm-unstable as a basis for your mm work Chen, this is something you+In general, LGTM, but see for conflict:
cc = kmalloc_obj(*cc);
if (!cc)
return -ENOMEM;
@@ -2845,9 +2851,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
mmgrab(mm);
lru_add_drain_all();
- hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
- hend = end & HPAGE_PMD_MASK;
-
for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) {
enum scan_result result = SCAN_FAIL;
https://lore.kernel.org/all/20260409014323.2385982-1-ye.liu@xxxxxxxxx/
need to fix, the patch above has been around for a while and is in
mm-unstable.
You have patches in mm already so you should know better by now.
Thanks for your review.
Best regards,
Wandun
But I'm really not sure I'm in favour of this anyway. I'll defer to David but
this feels useless to me.
Thanks, Lorenzo
--
Cheers,
David