Re: [PATCH linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

From: Barry Song
Date: Wed Jun 05 2024 - 21:34:37 EST


On Thu, Jun 6, 2024 at 2:42 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 05.06.24 16:08, Zi Yan wrote:
> > On 5 Jun 2024, at 2:54, ran xiaokai wrote:
> >
> >>> On Tue, Jun 4, 2024 at 5:47?PM <xu.xin16@xxxxxxxxxx> wrote:
> >>>>
> >>>> From: Ran Xiaokai <ran.xiaokai@xxxxxxxxxx>
> >>>>
> >>>> When I did a large folios split test, a WARNING
> >>>> "[ 5059.122759][ T166] Cannot split file folio to non-0 order"
> >>>> was triggered. But my test cases are only for anonmous folios.
> >>>> while mapping_large_folio_support() is only reasonable for page
> >>>> cache folios.
> >>>>
> >>>> In split_huge_page_to_list_to_order(), the folio passed to
> >>>> mapping_large_folio_support() maybe anonmous folio. The
> >>>> folio_test_anon() check is missing. So the split of the anonmous THP
> >>>> is failed. This is also the same for shmem_mapping(). We'd better add
> >>>> a check for both. But the shmem_mapping() in __split_huge_page() is
> >>>> not involved, as for anonmous folios, the end parameter is set to -1, so
> >>>> (head[i].index >= end) is always false. shmem_mapping() is not called.
> >>>>
> >>>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> >>>> patch, large anon THP is successfully split and the warning is ceased.
> >>>>
> >>>> Signed-off-by: Ran Xiaokai <ran.xiaokai@xxxxxxxxxx>
> >>>> Cc: xu xin <xu.xin16@xxxxxxxxxx>
> >>>> Cc: Yang Yang <yang.yang29@xxxxxxxxxx>
> >>>> ---
> >>>> mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> >>>> 1 file changed, 20 insertions(+), 18 deletions(-)
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 317de2afd371..4c9c7e5ea20c 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>> if (new_order >= folio_order(folio))
> >>>> return -EINVAL;
> >>>>
> >>>> - /* Cannot split anonymous THP to order-1 */
> >>>> - if (new_order == 1 && folio_test_anon(folio)) {
> >>>> - VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >>>> - return -EINVAL;
> >>>> - }
> >>>> -
> >>>> if (new_order) {
> >>>> /* Only swapping a whole PMD-mapped folio is supported */
> >>>> if (folio_test_swapcache(folio))
> >>>> return -EINVAL;
> >>>> - /* Split shmem folio to non-zero order not supported */
> >>>> - if (shmem_mapping(folio->mapping)) {
> >>>> - VM_WARN_ONCE(1,
> >>>> - "Cannot split shmem folio to non-0 order");
> >>>> - return -EINVAL;
> >>>> - }
> >>>> - /* No split if the file system does not support large folio */
> >>>> - if (!mapping_large_folio_support(folio->mapping)) {
> >>>> - VM_WARN_ONCE(1,
> >>>> - "Cannot split file folio to non-0 order");
> >>>> - return -EINVAL;
> >>>> +
> >>>> + if (folio_test_anon(folio)) {
> >>>> + /* Cannot split anonymous THP to order-1 */
> >>>> + if (new_order == 1) {
> >>>> + VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >>>> + return -EINVAL;
> >>>> + }
> >>>> + } else {
> >>>> + /* Split shmem folio to non-zero order not supported */
> >>>> + if (shmem_mapping(folio->mapping)) {
> >>>> + VM_WARN_ONCE(1,
> >>>> + "Cannot split shmem folio to non-0 order");
> >>>> + return -EINVAL;
> >>>> + }
> >>>> + /* No split if the file system does not support large folio */
> >>>> + if (!mapping_large_folio_support(folio->mapping)) {
> >>>> + VM_WARN_ONCE(1,
> >>>> + "Cannot split file folio to non-0 order");
> >>>> + return -EINVAL;
> >>>> + }
> >>>
> >>> Am I missing something? if file system doesn't support large folio,
> >>> how could the large folio start to exist from the first place while its
> >>> mapping points to a file which doesn't support large folio?
> >>
> >> I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
> >> khugepaged will try to collapse read-only file-backed pages to 2M THP.
> >
> > Can you add this information to the commit log in your next version?
>
> Can we also add that as a comment to that function, like "Note that we
> might still
> have THPs in such mappings due to CONFIG_READ_ONLY_THP_FOR_FS. But in
> that case,
> the mapping does not actually support large folios properly.
> "Or sth. like that.

+1. Otherwise, the code appears quite confusing.
Would using "#ifdef" help to further clarify it?

#ifdef CONFIG_READ_ONLY_THP_FOR_FS
if (!mapping_large_folio_support(folio->mapping)) {
....
}
#endif

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry