Re: [PATCH v7 06/11] filemap: cap PTE range to be created to allowed zero fill in folio_map_range()

From: David Hildenbrand
Date: Thu Jun 13 2024 - 04:07:42 EST


On 13.06.24 09:57, Luis Chamberlain wrote:
On Wed, Jun 12, 2024 at 08:08:15PM +0100, Matthew Wilcox wrote:
On Fri, Jun 07, 2024 at 02:58:57PM +0000, Pankaj Raghav (Samsung) wrote:
From: Pankaj Raghav <p.raghav@xxxxxxxxxxx>

Usually the page cache does not extend beyond the size of the inode,
therefore, no PTEs are created for folios that extend beyond the size.

But with LBS support, we might extend page cache beyond the size of the
inode as we need to guarantee folios of minimum order. Cap the PTE range
to be created for the page cache up to the max allowed zero-fill file
end, which is aligned to the PAGE_SIZE.

I think this is slightly misleading because we might well zero-fill
to the end of the folio. The issue is that we're supposed to SIGBUS
if userspace accesses pages which lie entirely beyond the end of this
file. Can you rephrase this?

(from mmap(2))
SIGBUS Attempted access to a page of the buffer that lies beyond the end
of the mapped file. For an explanation of the treatment of the
bytes in the page that corresponds to the end of a mapped file
that is not a multiple of the page size, see NOTES.


The code is good though.

Reviewed-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>

Since I've been curating the respective fstests test to test for this
POSIX corner case [0] I wanted to enable the test for tmpfs instead of
skipping it as I originally had it, and that meant also realizing mmap(2)
specifically says this now:

Huge page (Huge TLB) mappings

Confusion alert: this likely talks about hugetlb (MAP_HUGETLB), not THP and friends.

So it might not be required for below changes.

...
For mmap(), offset must be a multiple of the underlying huge page
size. The system automatically aligns length to be a multiple of
the underlying huge page size.

So do we need to adjust this patch with this:

diff --git a/mm/filemap.c b/mm/filemap.c
index ea78963f0956..9c8897ba90ff 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3617,6 +3617,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
vm_fault_t ret = 0;
unsigned long rss = 0;
unsigned int nr_pages = 0, mmap_miss = 0, mmap_miss_saved, folio_type;
+ unsigned int align = PAGE_SIZE;
rcu_read_lock();
folio = next_uptodate_folio(&xas, mapping, end_pgoff);
@@ -3636,7 +3637,10 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
goto out;
}
- file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
+ if (folio_test_pmd_mappable(folio))
+ align = 1 << folio_order(folio);
+
+ file_end = DIV_ROUND_UP(i_size_read(mapping->host), align) - 1;
if (end_pgoff > file_end)
end_pgoff = file_end;

[0] https://lore.kernel.org/all/20240611030203.1719072-3-mcgrof@xxxxxxxxxx/

Luis


--
Cheers,

David / dhildenb