Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

From: David Hildenbrand
Date: Thu Jun 06 2024 - 03:18:46 EST


On 06.06.24 03:34, Barry Song wrote:
On Thu, Jun 6, 2024 at 2:42 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 05.06.24 16:08, Zi Yan wrote:
On 5 Jun 2024, at 2:54, ran xiaokai wrote:

On Tue, Jun 4, 2024 at 5:47?PM <xu.xin16@xxxxxxxxxx> wrote:

From: Ran Xiaokai <ran.xiaokai@xxxxxxxxxx>

When I did a large folios split test, a WARNING
"[ 5059.122759][ T166] Cannot split file folio to non-0 order"
was triggered. But my test cases are only for anonmous folios.
while mapping_large_folio_support() is only reasonable for page
cache folios.

In split_huge_page_to_list_to_order(), the folio passed to
mapping_large_folio_support() maybe anonmous folio. The
folio_test_anon() check is missing. So the split of the anonmous THP
is failed. This is also the same for shmem_mapping(). We'd better add
a check for both. But the shmem_mapping() in __split_huge_page() is
not involved, as for anonmous folios, the end parameter is set to -1, so
(head[i].index >= end) is always false. shmem_mapping() is not called.

Using /sys/kernel/debug/split_huge_pages to verify this, with this
patch, large anon THP is successfully split and the warning is ceased.

Signed-off-by: Ran Xiaokai <ran.xiaokai@xxxxxxxxxx>
Cc: xu xin <xu.xin16@xxxxxxxxxx>
Cc: Yang Yang <yang.yang29@xxxxxxxxxx>
---
mm/huge_memory.c | 38 ++++++++++++++++++++------------------
1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 317de2afd371..4c9c7e5ea20c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
if (new_order >= folio_order(folio))
return -EINVAL;

- /* Cannot split anonymous THP to order-1 */
- if (new_order == 1 && folio_test_anon(folio)) {
- VM_WARN_ONCE(1, "Cannot split to order-1 folio");
- return -EINVAL;
- }
-
if (new_order) {
/* Only swapping a whole PMD-mapped folio is supported */
if (folio_test_swapcache(folio))
return -EINVAL;
- /* Split shmem folio to non-zero order not supported */
- if (shmem_mapping(folio->mapping)) {
- VM_WARN_ONCE(1,
- "Cannot split shmem folio to non-0 order");
- return -EINVAL;
- }
- /* No split if the file system does not support large folio */
- if (!mapping_large_folio_support(folio->mapping)) {
- VM_WARN_ONCE(1,
- "Cannot split file folio to non-0 order");
- return -EINVAL;
+
+ if (folio_test_anon(folio)) {
+ /* Cannot split anonymous THP to order-1 */
+ if (new_order == 1) {
+ VM_WARN_ONCE(1, "Cannot split to order-1 folio");
+ return -EINVAL;
+ }
+ } else {
+ /* Split shmem folio to non-zero order not supported */
+ if (shmem_mapping(folio->mapping)) {
+ VM_WARN_ONCE(1,
+ "Cannot split shmem folio to non-0 order");
+ return -EINVAL;
+ }
+ /* No split if the file system does not support large folio */
+ if (!mapping_large_folio_support(folio->mapping)) {
+ VM_WARN_ONCE(1,
+ "Cannot split file folio to non-0 order");
+ return -EINVAL;
+ }

Am I missing something? if file system doesn't support large folio,
how could the large folio start to exist from the first place while its
mapping points to a file which doesn't support large folio?

I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
khugepaged will try to collapse read-only file-backed pages to 2M THP.

Can you add this information to the commit log in your next version?

Can we also add that as a comment to that function, like "Note that we
might still
have THPs in such mappings due to CONFIG_READ_ONLY_THP_FOR_FS. But in
that case,
the mapping does not actually support large folios properly.
"Or sth. like that.

+1. Otherwise, the code appears quite confusing.
Would using "#ifdef" help to further clarify it?

#ifdef CONFIG_READ_ONLY_THP_FOR_FS
if (!mapping_large_folio_support(folio->mapping)) {
....
}

if (IS_ENABLED()) ...

Agreed.

--
Cheers,

David / dhildenb