Re: [PATCH] mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma

From: Yang Shi
Date: Mon Jan 13 2025 - 13:51:43 EST





On 1/10/25 7:54 PM, Liu Shixin wrote:

On 2025/1/11 3:40, Yang Shi wrote:


On 1/10/25 11:01 AM, Matthew Wilcox wrote:
On Fri, Jan 10, 2025 at 10:04:42AM -0800, Yang Shi wrote:
On 1/9/25 8:31 PM, Matthew Wilcox wrote:
On Thu, Jan 09, 2025 at 09:00:24AM -0800, Yang Shi wrote:
Thanks for catching this. It sounds a little bit weird to have vm_file for
an anonymous VMA. I'm not sure why we should keep such special case. It
seems shared mapping is treated as shmem file mapping. So can we set vm_file
to NULL when mmap'ing /dev/zero for private mapping? Something like:

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 169eed162a7f..fc332efc5c11 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -527,6 +527,7 @@ static int mmap_zero(struct file *file, struct
vm_area_struct *vma)
if (vma->vm_flags & VM_SHARED)
return shmem_zero_setup(vma);
vma_set_anonymous(vma);
+ vma->vm_file = NULL;
return 0;
}
I'm wary this might cause other bugs somewhere. rc6 is a bit late to be
introducing such a subtle change.
Thanks for the extra caution. Applying the proposed fix in khugepaged code
is fine to me either. We can try to kill the special case later.

Looking at the code further, I think we should do more to make private
/dev/zero mapping an anonymous mapping:
I'm still nervous about this. We map device inodes in a lot of places.
Yes, we do. But I don't think this change actually changes the semantic of /dev/zero. Shared /dev/zero mapping is still treated as shmem mapping, private /dev/zero mapping is treated as anonymous mapping, but the current implementation is actually half-baked. It has NULL vma->vm_ops which is used to tell kernel whether it is an anonymous vma or not, but it also has valid vma->vm_file and vma->vm_pgoff as in file offset.

So this special case makes kernel has 3 types of VMA:
- anonymous VMA: vm_ops is NULL, vm_file is NULL, vm_pgoff is the linear address pgoff
- file VMA: vm_ops is *NOT* NULL, valid vm_file and vm_pgoff is index in file
- private /dev/zero mapping VMA

I have posted v2 to fix it in a safe way. Link: https://lore.kernel.org/all/20250111034511.2223353-1-liushixin2@xxxxxxxxxx/

Maybe we can also revisit commit bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives") and fix it by another way?

What do you mean about revisiting this commit? Reset vm_file and recalculate vm_pgoff in vma_set_anonymous()?


By the way, it seems we collpase the file even after cow for a private file mapping. Is that so?

It seems so.


Thanks,
.