We hit a WARN during our testing; the detailed log is shown at the end.
The core problem behind this WARN is that the first pfn of this pfnmap vma
is cleared during memory-failure. Digging into the source, we find that the
problem can be triggered as follows:

// mmap with MAP_PRIVATE and a specific fd whose driver hooks mmap
mmap(MAP_PRIVATE, fd)
  __mmap_region
    remap_pfn_range
    // set vma with pfnmap and the prot of pte is read only

// memset this memory to trigger a fault
handle_mm_fault
  __handle_mm_fault
    handle_pte_fault
      // write fault and !pte_write(entry)
      do_wp_page
        wp_page_copy // this will alloc a new page with valid page struct
                     // for this pfnmap vma

// inject a hwpoison to the first page of this vma
madvise_inject_error
  memory_failure
    hwpoison_user_mappings
      try_to_unmap_one
        // mark this pte as invalid (hwpoison)
        mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
                                address, range.end);

// during unmap vma, the first pfn of this pfnmap vma is invalid
vm_mmap_pgoff
  do_mmap
    __do_mmap_mm
      __mmap_region
        __do_munmap
          unmap_region
            unmap_vmas
              unmap_single_vma
                untrack_pfn
                  follow_phys // pte is already invalidated, WARN_ON here

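
For anyone who wants to retrace the sequence from userspace, the four steps
above could be driven by something like the sketch below. This is not our
actual test program. The function name reproduce_pfnmap_cow() and the device
path passed to it are hypothetical; it assumes a device node whose mmap
handler calls remap_pfn_range() and installs read-only ptes, and it assumes
CAP_SYS_ADMIN and CONFIG_MEMORY_FAILURE for MADV_HWPOISON. It really poisons
a page, so it should only be run on a disposable test machine.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100 /* value from the Linux uapi mman headers */
#endif

/*
 * Walk through the four steps of the trace. Returns 0 if every step ran,
 * -1 on the first failure. dev_path is a hypothetical device node whose
 * mmap handler is assumed to call remap_pfn_range().
 */
int reproduce_pfnmap_cow(const char *dev_path, size_t len)
{
    long page_size = sysconf(_SC_PAGESIZE);
    char *p;
    int fd;

    fd = open(dev_path, O_RDWR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* Step 1: private mapping; the driver sets up the pfnmap vma. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    /* Step 2: write fault on a read-only pte, expected to end in wp_page_copy. */
    memset(p, 0xaa, len);

    /* Step 3: inject hwpoison into the first page of the vma. */
    if (madvise(p, (size_t)page_size, MADV_HWPOISON) != 0) {
        perror("madvise(MADV_HWPOISON)");
        munmap(p, len);
        return -1;
    }

    /* Step 4: unmap; this is where untrack_pfn/follow_phys would run. */
    if (munmap(p, len) != 0) {
        perror("munmap");
        return -1;
    }
    return 0;
}
```

A caller would pass the path of its own test device and a length covering at
least one page. Note that the trace above reaches __do_munmap through a
second mmap over the same range, while the sketch uses a plain munmap()
instead, on the assumption that both end up in the same unmap path.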
CoW that installs a valid page in a pfnmap vma looks odd to us. Can we use
remap_pfn_range for a private (read-only) vma at all? Once CoW happens on a
pfnmap vma during a write fault, the new page is a normal page (its page
flags are valid) as far as most mm subsystems are concerned, memory-failure
in this case, so extra work is needed to handle this special page.
During unmap, if the vma is pfnmap, the unmap shouldn't be done, since pages
should not be touched for a pfnmap vma.
But the root question is: can we insert a valid page into a pfnmap vma?
Any thoughts on how to resolve this WARN?