Re: [PATCH v5 8/8] mm: huge_memory: enable debugfs to split huge pages to any order.

From: Aishwarya TCV
Date: Mon Mar 04 2024 - 10:48:24 EST

On 04/03/2024 14:58, Zi Yan wrote:
> On 4 Mar 2024, at 4:50, Aishwarya TCV wrote:
>
>> On 01/03/2024 21:10, Zi Yan wrote:
>>> On 1 Mar 2024, at 15:02, Zi Yan wrote:
>>>
>>>> On 1 Mar 2024, at 14:37, Zi Yan wrote:
>>>>
>>>>> On 1 Mar 2024, at 4:51, Aishwarya TCV wrote:
>>>>>
>>>>>> On 26/02/2024 20:55, Zi Yan wrote:
>>>>>>> From: Zi Yan <ziy@xxxxxxxxxx>
>>>>>>>
>>>>>>> It is used to test split_huge_page_to_list_to_order for pagecache THPs.
>>>>>>> Also add test cases for split_huge_page_to_list_to_order via debugfs.
>>>>>>>
>>>>>>> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
>>>>>>> ---
>>>>>>> mm/huge_memory.c | 34 ++++--
>>>>>>> .../selftests/mm/split_huge_page_test.c | 115 +++++++++++++++++-
>>>>>>> 2 files changed, 131 insertions(+), 18 deletions(-)
>>>>>>>
>>>>>>
>>>>>> Hi Zi,
>>>>>>
>>>>>> When booting the kernel against next-master (20240228) with Arm64 on
>>>>>> Marvell Thunder X2 (TX2), the kselftest-mm test 'split_huge_page_test'
>>>>>> is failing in our CI (with rootfs over NFS). I can send the full logs if
>>>>>> required.
>>>>>>
>>>>>> A bisect (full log below) identified this patch as introducing the
>>>>>> failure. Bisected it on the tag "next-20240228" at repo
>>>>>> "https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git".
>>>>>>
>>>>>> This works fine on Linux version 6.8.0-rc6
>>>>>
>>>>> Hi Aishwarya,
>>>>>
>>>>> Can you try the attached patch and see if it fixes the failure? I changed
>>>>> the test to accept an XFS device as input, mount it on a temp folder under
>>>>> /tmp, and skip the test if no XFS is mounted.
>>>>
>>>> Please try this updated one. It allows you to specify an XFS device path
>>>> in the SPLIT_HUGE_PAGE_TEST_XFS_PATH env variable, which is passed to
>>>> split_huge_page_test by run_vmtests.sh. It at least allows CI/CD to run
>>>> the test without too much change.
>>>
>>> OK. This hopefully will be my last churn. Now split_huge_page_test accepts
>>> a path that is backed by XFS, and run_vmtests.sh creates an XFS image in
>>> /tmp, mounts it under /tmp, and passes the path to split_huge_page_test. I
>>> tested it locally and it works. Let me know if you run into any issues.
>>> Thanks.
>>>
>>> --
>>> Best Regards,
>>> Yan, Zi
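
For reference, here is a minimal sketch of how a test can check that the
XFS-backed path mentioned above really sits on XFS before exercising large
pagecache folios. This is hypothetical code, not the actual
split_huge_page_test.c change; it only assumes statfs(2) and the
XFS_SUPER_MAGIC constant from <linux/magic.h>:

#include <stdio.h>
#include <sys/vfs.h>        /* statfs(), struct statfs */
#include <linux/magic.h>    /* XFS_SUPER_MAGIC */

/* Return 1 if 'path' lives on an XFS filesystem, 0 otherwise. */
static int path_is_on_xfs(const char *path)
{
	struct statfs fs;

	if (statfs(path, &fs) != 0) {
		perror("statfs");
		return 0;
	}
	return fs.f_type == XFS_SUPER_MAGIC;
}

int main(int argc, char **argv)
{
	if (argc < 2 || !path_is_on_xfs(argv[1])) {
		printf("skip: no XFS-backed path supplied\n");
		return 0;
	}
	printf("%s is backed by XFS\n", argv[1]);
	return 0;
}
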
>>
>> Hi Zi,
>>
>> Tested the patch by applying it on next-20240304. Logs from our CI with
>> rootfs over NFS are attached below. "Bail out! cannot remove tmp dir:
>> Directory not empty" is still observed.
>
> Hi Aishwarya,
>
> Do you have the config file for the CI kernel? And is /tmp also on NFS?
> Any detailed information about the CI machine environment? I cannot reproduce
> the error locally, either on bare metal or in a VM. Maybe that is because my
> /tmp is not NFS-mounted?
>

Hi Zi,

Please find the details below. Hope it helps.

Do you have the config file for the CI kernel?
- We are using:
defconfig + https://github.com/torvalds/linux/blob/master/tools/testing/selftests/mm/config

Is /tmp also on NFS?
- Yes.

Any detailed information about the CI machine environment?
- We are running the test on a LAVA device, Cavium Thunder X2 (TX2).
- We have a rootfs very similar to this nfsrootfs:
https://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/20240129.0/arm64/full.rootfs.tar.xz
- We are using the grub boot method over NFS.
- Additionally, Ryan mentioned: "Looks like it is failing because he is
trying to delete the temp dir with rmdir() but rmdir() requires the
directory to be empty, which it is not."
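
For illustration, a minimal sketch of the kind of cleanup Ryan is describing
(hypothetical code, not the test itself): rmdir(2) fails with ENOTEMPTY on a
non-empty directory, so the temp dir's contents have to be removed first, for
example with a depth-first nftw(3) walk:

#define _XOPEN_SOURCE 500   /* for nftw() */
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* Called for every entry, children before their parent (FTW_DEPTH). */
static int remove_entry(const char *path, const struct stat *sb,
			int typeflag, struct FTW *ftwbuf)
{
	/* remove() unlinks files and rmdir()s now-empty directories */
	if (remove(path) != 0)
		perror(path);
	return 0;
}

static void remove_tree(const char *dir)
{
	/* FTW_DEPTH: bottom-up walk; FTW_PHYS: do not follow symlinks */
	if (nftw(dir, remove_entry, 16, FTW_DEPTH | FTW_PHYS) != 0)
		perror("nftw");
}
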

Thanks,
Aishwarya

>>
>>
>> Test run log:
>> # # ------------------------------
>> # # running ./split_huge_page_test
>> # # ------------------------------
>> # # TAP version 13
>> # # 1..12
>> # # ok 1 Split huge pages successful
>> # # ok 2 Split PTE-mapped huge pages successful
>> # # # Please enable pr_debug in split_huge_pages_in_file() for more info.
>> # # # Please check dmesg for more information
>> # # ok 3 File-backed THP split test done
>> <6>[ 639.821657] split_huge_page (111099): drop_caches: 3
>> <6>[ 639.821657] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 4 # SKIP Pagecache folio split skipped
>> <6>[ 645.392184] split_huge_page (111099): drop_caches: 3
>> <6>[ 645.392184] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 5 # SKIP Pagecache folio split skipped
>> <6>[ 650.938248] split_huge_page (111099): drop_caches: 3
>> <6>[ 650.938248] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 6 # SKIP Pagecache folio split skipped
>> <6>[ 656.500149] split_huge_page (111099): drop_caches: 3
>> <6>[ 656.500149] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 7 # SKIP Pagecache folio split skipped
>> <6>[ 662.044085] split_huge_page (111099): drop_caches: 3
>> <6>[ 662.044085] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 8 # SKIP Pagecache folio split skipped
>> <6>[ 667.591841] split_huge_page (111099): drop_caches: 3
>> <6>[ 667.591841] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 9 # SKIP Pagecache folio split skipped
>> <6>[ 673.172441] split_huge_page (111099): drop_caches: 3
>> <6>[ 673.172441] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 10 # SKIP Pagecache folio split skipped
>> <6>[ 678.726263] split_huge_page (111099): drop_caches: 3
>> <6>[ 678.726263] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 11 # SKIP Pagecache folio split skipped
>> <6>[ 684.272851] split_huge_page (111099): drop_caches: 3
>> <6>[ 684.272851] split_huge_page (111099): drop_caches: 3
>> # # # No large pagecache folio generated, please provide a filesystem
>> supporting large folio
>> # # ok 12 # SKIP Pagecache folio split skipped
>> # # Bail out! cannot remove tmp dir: Directory not empty
>> # # # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:9 error:0
>> # # [FAIL]
>> # not ok 51 split_huge_page_test # exit=1
>> # # ------------------
>>
>> Thanks,
>> Aishwarya
>
>
> --
> Best Regards,
> Yan, Zi