Re: [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

From: Yang Shi
Date: Mon Nov 01 2021 - 15:26:43 EST


On Mon, Nov 1, 2021 at 12:05 PM Naresh Kamboju
<naresh.kamboju@xxxxxxxxxx> wrote:
>
> Hi Yang,
>
> On Fri, 1 Oct 2021 at 03:23, Yang Shi <shy828301@xxxxxxxxx> wrote:
> >
> > When handling a shmem page fault, a THP with a corrupted subpage can be PMD
> > mapped if certain conditions are satisfied. But the kernel is supposed to
> > send SIGBUS when trying to map a hwpoisoned page.
> >
> > There are two paths which may do PMD map: fault around and regular fault.
> >
> > Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> > things were even worse in the fault around path. The THP could be PMD mapped as
> > long as the VMA fit, regardless of which subpage was accessed or corrupted. After
> > this commit, the THP can be PMD mapped as long as the head page is not corrupted.
> >
> > In the regular fault path the THP could be PMD mapped as long as the corrupted
> > page is not accessed and the VMA fits.
> >
> > This loophole could be fixed by iterating over every subpage to check whether
> > any of them is hwpoisoned, but that is somewhat costly in the page fault path.
> >
> > So introduce a new page flag called HasHWPoisoned on the first tail page. It
> > indicates the THP has hwpoisoned subpage(s). It is set if any subpage of THP
> > is found hwpoisoned by memory failure and cleared when the THP is freed or
> > split.
> >
> > Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Suggested-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
> > ---
> > include/linux/page-flags.h | 19 +++++++++++++++++++
> > mm/filemap.c | 12 ++++++------
> > mm/huge_memory.c | 2 ++
> > mm/memory-failure.c | 6 +++++-
> > mm/memory.c | 9 +++++++++
> > mm/page_alloc.c | 4 +++-
> > 6 files changed, 44 insertions(+), 8 deletions(-)
>
> When CONFIG_MEMORY_FAILURE is not set,
> we get these build failures.

Thanks for catching this. Willy's page folio series changed the
definition of the PAGEFLAG_FALSE and TESTSCFLAG_FALSE macros so that
they take two arguments. My patch only landed in 5.15-rc7, so his
series did not convert these two call sites.

The patch below should fix it:

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index d8623d6e1141..981341a3c3c4 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -803,8 +803,8 @@ PAGEFLAG_FALSE(DoubleMap, double_map)
 PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
 	TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
 #else
-PAGEFLAG_FALSE(HasHWPoisoned)
-	TESTSCFLAG_FALSE(HasHWPoisoned)
+PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
+	TESTSCFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
 #endif

/*

I will prepare a formal patch for 5.16.
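
For anyone who wants to see the failure mode outside the kernel tree, here
is a rough userspace sketch of why the one-argument form no longer
preprocesses. The macros below are simplified stand-ins I wrote for
illustration, not the real definitions in include/linux/page-flags.h. They
only mirror the two-argument (uname, lname) shape shown in the build log.
struct page is reduced to a single field and hwpoison_stubs_are_noops() is a
hypothetical helper that exists only in this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical minimal stand-in for the kernel's struct page. */
struct page {
	unsigned long flags;
};

/*
 * Simplified models of the stub generators. Each takes (uname, lname),
 * matching the arity reported by the compiler. lname is unused in this
 * model; it is kept only so the call sites look like the fixed ones.
 */
#define TESTPAGEFLAG_FALSE(uname, lname)				\
static inline bool Page##uname(const struct page *page)		\
{ (void)page; return false; }

#define SETPAGEFLAG_NOOP(uname, lname)					\
static inline void SetPage##uname(struct page *page) { (void)page; }

#define CLEARPAGEFLAG_NOOP(uname, lname)				\
static inline void ClearPage##uname(struct page *page) { (void)page; }

#define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname)	\
	SETPAGEFLAG_NOOP(uname, lname) CLEARPAGEFLAG_NOOP(uname, lname)

#define TESTSCFLAG_FALSE(uname, lname)					\
static inline bool TestSetPage##uname(struct page *page)		\
{ (void)page; return false; }						\
static inline bool TestClearPage##uname(struct page *page)		\
{ (void)page; return false; }

/*
 * Two-argument invocations, as in the fix. Writing
 * PAGEFLAG_FALSE(HasHWPoisoned) here instead would fail with
 * "macro requires 2 arguments, but only 1 given".
 */
PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
	TESTSCFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)

/* Returns true when every generated stub behaves as "flag never set". */
bool hwpoison_stubs_are_noops(void)
{
	struct page p = { 0 };

	SetPageHasHWPoisoned(&p);	/* no-op in the stubbed build */
	return !PageHasHWPoisoned(&p) &&
	       !TestSetPageHasHWPoisoned(&p) &&
	       !TestClearPageHasHWPoisoned(&p) &&
	       p.flags == 0;
}
```

With the stubs in place, callers can test the flag unconditionally and the
compiler can drop the dead branch when CONFIG_MEMORY_FAILURE is off, which
is the point of having the _FALSE variants at all.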

>
> Regression found on x86_64 and i386 gcc-11 builds
> Following build warnings / errors reported on Linux mainline master.
>
> metadata:
> git_describe: v5.15-559-g19901165d90f
> git_repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git_short_log: 19901165d90f ("Merge tag
> 'for-5.16/inode-sync-2021-10-29' of git://git.kernel.dk/linux-block")
> target_arch: x86_64
> toolchain: gcc-11
>
>
> In file included from include/linux/mmzone.h:22,
> from include/linux/gfp.h:6,
> from include/linux/slab.h:15,
> from include/linux/crypto.h:20,
> from arch/x86/kernel/asm-offsets.c:9:
> include/linux/page-flags.h:806:29: error: macro "PAGEFLAG_FALSE" requires 2 arguments, but only 1 given
> 806 | PAGEFLAG_FALSE(HasHWPoisoned)
> | ^
> include/linux/page-flags.h:411: note: macro "PAGEFLAG_FALSE" defined here
> 411 | #define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname) \
> |
> include/linux/page-flags.h:807:39: error: macro "TESTSCFLAG_FALSE" requires 2 arguments, but only 1 given
> 807 | TESTSCFLAG_FALSE(HasHWPoisoned)
> | ^
> include/linux/page-flags.h:414: note: macro "TESTSCFLAG_FALSE" defined here
> 414 | #define TESTSCFLAG_FALSE(uname, lname) \
> |
> include/linux/page-flags.h:806:1: error: unknown type name 'PAGEFLAG_FALSE'
> 806 | PAGEFLAG_FALSE(HasHWPoisoned)
> | ^~~~~~~~~~~~~~
> include/linux/page-flags.h:807:25: error: expected ';' before 'static'
> 807 | TESTSCFLAG_FALSE(HasHWPoisoned)
> | ^
> | ;
> ......
> 815 | static inline bool is_page_hwpoison(struct page *page)
> | ~~~~~~
> make[2]: *** [scripts/Makefile.build:121: arch/x86/kernel/asm-offsets.s] Error 1
>
> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
>
> build link:
> -----------
> https://builds.tuxbuild.com/20KPBpXK6K0bKSIKAIKfwlBq7O4/build.log
>
> build config:
> -------------
> https://builds.tuxbuild.com/20KPBpXK6K0bKSIKAIKfwlBq7O4/config
>
> # To install tuxmake on your system globally
> # sudo pip3 install -U tuxmake
>
> tuxmake --runtime podman --target-arch x86_64 --toolchain gcc-11 --kconfig defconfig --kconfig-add https://builds.tuxbuild.com/20KPBpXK6K0bKSIKAIKfwlBq7O4/config
>
> link:
> https://builds.tuxbuild.com/20KPBpXK6K0bKSIKAIKfwlBq7O4/tuxmake_reproducer.sh
>
> --
> Linaro LKFT
> https://lkft.linaro.org