Re: [PATCH v8 15/23] mm/hugetlb: Handle pte markers in page faults

From: kernel test robot
Date: Wed Apr 06 2022 - 12:07:32 EST

In reply to: Peter Xu: "[PATCH v8 15/23] mm/hugetlb: Handle pte markers in page faults"
Next in thread: Peter Xu: "Re: [PATCH v8 15/23] mm/hugetlb: Handle pte markers in page faults"

Hi Peter,

Thank you for the patch! However, there is something to improve:

[auto build test ERROR on hnaz-mm/master]
[cannot apply to arnd-asm-generic/master linus/master linux/master v5.18-rc1 next-20220406]
[If your patch was applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/userfaultfd-wp-Support-shmem-and-hugetlbfs/20220405-100136
base: https://github.com/hnaz/linux-mm master
config: s390-randconfig-r044-20220406 (https://download.01.org/0day-ci/archive/20220406/202204062154.2txNJyaf-lkp@xxxxxxxxx/config)
compiler: s390-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/e7e7aaec811e2817cd169f0cc1d8f81bdf1f05c3
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Peter-Xu/userfaultfd-wp-Support-shmem-and-hugetlbfs/20220405-100136
git checkout e7e7aaec811e2817cd169f0cc1d8f81bdf1f05c3
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=s390 SHELL=/bin/bash

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@xxxxxxxxx>

All errors (new ones prefixed by >>):

mm/hugetlb.c: In function 'hugetlb_fault':
>> mm/hugetlb.c:5678:13: error: implicit declaration of function 'huge_pte_none_mostly'; did you mean 'pte_none_mostly'? [-Werror=implicit-function-declaration]
    5678 |         if (huge_pte_none_mostly(entry)) {
         |             ^~~~~~~~~~~~~~~~~~~~
         |             pte_none_mostly
cc1: some warnings being treated as errors
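
The error means that no declaration of huge_pte_none_mostly() is visible to mm/hugetlb.c in this s390 configuration. The following is a minimal userspace sketch of what such a helper might look like, based only on the comment at line 5677 ("PTE markers should be handled the same way as none pte"). The pte_t stand-in, the MOCK_* encodings and the bodies of huge_pte_none() and is_pte_marker() are assumptions made for illustration; they are not the s390 definitions and not necessarily the fix the patch author will choose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's pte_t; NOT the real s390 layout. */
typedef struct { uint64_t val; } pte_t;

/* Hypothetical encodings, chosen only so the sketch compiles and runs. */
#define MOCK_PTE_NONE   UINT64_C(0x0)
#define MOCK_PTE_MARKER UINT64_C(0x1)

/* Stand-in for an architecture's "is this huge pte empty?" test. */
static inline bool huge_pte_none(pte_t pte)
{
	return pte.val == MOCK_PTE_NONE;
}

/* Stand-in for the generic "is this a pte marker?" test. */
static inline bool is_pte_marker(pte_t pte)
{
	return pte.val == MOCK_PTE_MARKER;
}

/*
 * The helper the build could not find: true for an empty entry and
 * also for a pte marker, so both take the hugetlb_no_page() path.
 */
static inline bool huge_pte_none_mostly(pte_t pte)
{
	return huge_pte_none(pte) || is_pte_marker(pte);
}
```

In the kernel tree, a definition of this shape would need to be reachable from whichever hugetlb header the failing architecture actually includes; which header that is for s390 is not stated in this report.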

vim +5678 mm/hugetlb.c

5616
5617 vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
5618 unsigned long address, unsigned int flags)
5619 {
5620 pte_t *ptep, entry;
5621 spinlock_t *ptl;
5622 vm_fault_t ret;
5623 u32 hash;
5624 pgoff_t idx;
5625 struct page *page = NULL;
5626 struct page *pagecache_page = NULL;
5627 struct hstate *h = hstate_vma(vma);
5628 struct address_space *mapping;
5629 int need_wait_lock = 0;
5630 unsigned long haddr = address & huge_page_mask(h);
5631
5632 ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
5633 if (ptep) {
5634 /*
5635 * Since we hold no locks, ptep could be stale. That is
5636 * OK as we are only making decisions based on content and
5637 * not actually modifying content here.
5638 */
5639 entry = huge_ptep_get(ptep);
5640 if (unlikely(is_hugetlb_entry_migration(entry))) {
5641 migration_entry_wait_huge(vma, mm, ptep);
5642 return 0;
5643 } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
5644 return VM_FAULT_HWPOISON_LARGE |
5645 VM_FAULT_SET_HINDEX(hstate_index(h));
5646 }
5647
5648 /*
5649 * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold
5650 * until finished with ptep. This serves two purposes:
5651 * 1) It prevents huge_pmd_unshare from being called elsewhere
5652 * and making the ptep no longer valid.
5653 * 2) It synchronizes us with i_size modifications during truncation.
5654 *
5655 * ptep could have already be assigned via huge_pte_offset. That
5656 * is OK, as huge_pte_alloc will return the same value unless
5657 * something has changed.
5658 */
5659 mapping = vma->vm_file->f_mapping;
5660 i_mmap_lock_read(mapping);
5661 ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h));
5662 if (!ptep) {
5663 i_mmap_unlock_read(mapping);
5664 return VM_FAULT_OOM;
5665 }
5666
5667 /*
5668 * Serialize hugepage allocation and instantiation, so that we don't
5669 * get spurious allocation failures if two CPUs race to instantiate
5670 * the same page in the page cache.
5671 */
5672 idx = vma_hugecache_offset(h, vma, haddr);
5673 hash = hugetlb_fault_mutex_hash(mapping, idx);
5674 mutex_lock(&hugetlb_fault_mutex_table[hash]);
5675
5676 entry = huge_ptep_get(ptep);
5677 /* PTE markers should be handled the same way as none pte */
> 5678 if (huge_pte_none_mostly(entry)) {
5679 ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
5680 entry, flags);
5681 goto out_mutex;
5682 }
5683
5684 ret = 0;
5685
5686 /*
5687 * entry could be a migration/hwpoison entry at this point, so this
5688 * check prevents the kernel from going below assuming that we have
5689 * an active hugepage in pagecache. This goto expects the 2nd page
5690 * fault, and is_hugetlb_entry_(migration|hwpoisoned) check will
5691 * properly handle it.
5692 */
5693 if (!pte_present(entry))
5694 goto out_mutex;
5695
5696 /*
5697 * If we are going to COW/unshare the mapping later, we examine the
5698 * pending reservations for this page now. This will ensure that any
5699 * allocations necessary to record that reservation occur outside the
5700 * spinlock. For private mappings, we also lookup the pagecache
5701 * page now as it is used to determine if a reservation has been
5702 * consumed.
5703 */
5704 if ((flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) &&
5705 !huge_pte_write(entry)) {
5706 if (vma_needs_reservation(h, vma, haddr) < 0) {
5707 ret = VM_FAULT_OOM;
5708 goto out_mutex;
5709 }
5710 /* Just decrements count, does not deallocate */
5711 vma_end_reservation(h, vma, haddr);
5712
5713 if (!(vma->vm_flags & VM_MAYSHARE))
5714 pagecache_page = hugetlbfs_pagecache_page(h,
5715 vma, haddr);
5716 }
5717
5718 ptl = huge_pte_lock(h, mm, ptep);
5719
5720 /* Check for a racing update before calling hugetlb_wp() */
5721 if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
5722 goto out_ptl;
5723
5724 /* Handle userfault-wp first, before trying to lock more pages */
5725 if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(ptep)) &&
5726 (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) {
5727 struct vm_fault vmf = {
5728 .vma = vma,
5729 .address = haddr,
5730 .real_address = address,
5731 .flags = flags,
5732 };
5733
5734 spin_unlock(ptl);
5735 if (pagecache_page) {
5736 unlock_page(pagecache_page);
5737 put_page(pagecache_page);
5738 }
5739 mutex_unlock(&hugetlb_fault_mutex_table[hash]);
5740 i_mmap_unlock_read(mapping);
5741 return handle_userfault(&vmf, VM_UFFD_WP);
5742 }
5743
5744 /*
5745 * hugetlb_wp() requires page locks of pte_page(entry) and
5746 * pagecache_page, so here we need take the former one
5747 * when page != pagecache_page or !pagecache_page.
5748 */
5749 page = pte_page(entry);
5750 if (page != pagecache_page)
5751 if (!trylock_page(page)) {
5752 need_wait_lock = 1;
5753 goto out_ptl;
5754 }
5755
5756 get_page(page);
5757
5758 if (flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) {
5759 if (!huge_pte_write(entry)) {
5760 ret = hugetlb_wp(mm, vma, address, ptep, flags,
5761 pagecache_page, ptl);
5762 goto out_put_page;
5763 } else if (likely(flags & FAULT_FLAG_WRITE)) {
5764 entry = huge_pte_mkdirty(entry);
5765 }
5766 }
5767 entry = pte_mkyoung(entry);
5768 if (huge_ptep_set_access_flags(vma, haddr, ptep, entry,
5769 flags & FAULT_FLAG_WRITE))
5770 update_mmu_cache(vma, haddr, ptep);
5771 out_put_page:
5772 if (page != pagecache_page)
5773 unlock_page(page);
5774 put_page(page);
5775 out_ptl:
5776 spin_unlock(ptl);
5777
5778 if (pagecache_page) {
5779 unlock_page(pagecache_page);
5780 put_page(pagecache_page);
5781 }
5782 out_mutex:
5783 mutex_unlock(&hugetlb_fault_mutex_table[hash]);
5784 i_mmap_unlock_read(mapping);
5785 /*
5786 * Generally it's safe to hold refcount during waiting page lock. But
5787 * here we just wait to defer the next page fault to avoid busy loop and
5788 * the page is not used after unlocked before returning from the current
5789 * page fault. So we are safe from accessing freed page, even if we wait
5790 * here without taking refcount.
5791 */
5792 if (need_wait_lock)
5793 wait_on_page_locked(page);
5794 return ret;
5795 }
5796
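
For readers following the listing, line 5630 rounds the faulting address down to the start of its huge page before any page-table lookup. A small userspace sketch of that arithmetic is shown here; mock_huge_page_mask() and huge_page_align_down() are hypothetical names, and the huge page size is passed in directly instead of being read from a struct hstate.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for huge_page_mask(h): a mask that clears the
 * offset bits inside one huge page. huge_page_size must be a power of two.
 */
static inline uint64_t mock_huge_page_mask(uint64_t huge_page_size)
{
	return ~(huge_page_size - 1);
}

/* Mirrors "haddr = address & huge_page_mask(h)" from line 5630. */
static inline uint64_t huge_page_align_down(uint64_t address,
					    uint64_t huge_page_size)
{
	return address & mock_huge_page_mask(huge_page_size);
}
```

With an assumed 1 MiB huge page size (0x100000), the address 0x12345678 is rounded down to 0x12300000, which corresponds to the haddr value used for huge_pte_offset() and huge_pte_alloc() in the listing.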

--
0-DAY CI Kernel Test Service
https://01.org/lkp
