Re: [PATCH] mm: Avoid data corruption on CoW fault into PFN-mapped VMA

From: Andrew Morton
Date: Wed Feb 19 2020 - 16:22:43 EST


On Tue, 18 Feb 2020 18:41:51 +0300 "Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> wrote:

> Jeff Moyer has reported that one of the xfstests triggers a warning when
> run on a DAX-enabled filesystem:
>
> WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
> ...
> wp_page_copy+0x98c/0xd50 (unreliable)
> do_wp_page+0xd8/0xad0
> __handle_mm_fault+0x748/0x1b90
> handle_mm_fault+0x120/0x1f0
> __do_page_fault+0x240/0xd70
> do_page_fault+0x38/0xd0
> handle_page_fault+0x10/0x30
>
> The warning is triggered when __copy_from_user_inatomic(), which tries to
> copy data into a CoW page, fails.
>
> This happens because of a race between MADV_DONTNEED and a CoW page fault:
>
>   CPU0                              CPU1
>   handle_mm_fault()
>     do_wp_page()
>       wp_page_copy()
>         do_wp_page()
>                                     madvise(MADV_DONTNEED)
>                                       zap_page_range()
>                                         zap_pte_range()
>                                           ptep_get_and_clear_full()
>                                           <TLB flush>
>           __copy_from_user_inatomic()
>           sees empty PTE and fails
>           WARN_ON_ONCE(1)
>           clear_page()
>
> The solution is to re-try __copy_from_user_inatomic() under PTL after
> checking that the PTE matches orig_pte.
>
> The second copy attempt can still fail, for example due to a non-readable
> PTE, but there's nothing reasonable we can do about it except clear the
> CoW page.
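
A rough userspace sketch of the retry described above may help when reading
the changelog. It is not the kernel code: struct fake_pte, try_copy() and
cow_copy() are hypothetical stand-ins, a pthread mutex plays the role of the
PTL, and returning false when the PTE no longer matches is an assumption
about how the caller would react (the changelog does not say). It is meant
to be exercised single-threaded, so the first lockless read only mimics the
unlocked copy attempt.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical userspace stand-in for a single page-table entry. */
struct fake_pte {
    pthread_mutex_t ptl;  /* plays the role of the page table lock */
    unsigned long val;    /* 0 means "cleared", e.g. by MADV_DONTNEED */
    const char *src;      /* data behind the PTE; NULL means not readable */
};

/* Stand-in for __copy_from_user_inatomic(): returns 0 on success. */
static int try_copy(char *dst, const struct fake_pte *pte)
{
    if (pte->val == 0 || pte->src == NULL)
        return -1;
    memcpy(dst, pte->src, SKETCH_PAGE_SIZE);
    return 0;
}

/*
 * Returns true if the CoW page in dst is usable, false if the PTE changed
 * and the fault should be retried (assumed behaviour, see lead-in).
 */
static bool cow_copy(char *dst, struct fake_pte *pte, unsigned long orig_val)
{
    bool usable = true;

    assert(dst != NULL && pte != NULL);

    /* First attempt, without the lock. */
    if (try_copy(dst, pte) == 0)
        return true;

    pthread_mutex_lock(&pte->ptl);
    if (pte->val != orig_val) {
        /* The PTE no longer matches orig_pte: somebody zapped it. */
        usable = false;
    } else if (try_copy(dst, pte) != 0) {
        /* Second attempt failed too: all we can do is clear the page. */
        memset(dst, 0, SKETCH_PAGE_SIZE);
    }
    pthread_mutex_unlock(&pte->ptl);

    return usable;
}
```

With val set to 0 before the call, cow_copy() reports that the PTE changed
rather than handing back a cleared page; with val intact but src NULL it
falls through to the memset(), which corresponds to the "nothing reasonable
we can do" case.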

You don't think this is worthy of a cc:stable?