Re: Cache flush issue with page_mapping_file() and swap back shmem page ?
From: Hugh Dickins
Date: Wed May 27 2020 - 23:47:02 EST
Hi Jerome,
On Wed, 27 May 2020, Jerome Glisse wrote:
> So any arch code which uses page_mapping_file() might get the wrong
> answer, this function will return NULL for a swap backed page which
> can be a shmem pages. But shmem pages can still be shared among
> multiple process (and possibly at different virtual addresses if
> mremap was use).
>
> Attached is a patch that changes page_mapping_file() to return the
> shmem mapping for swap backed shmem page. I have not tested it (no
> way for me to test all those architecture) and i spotted this while
> working on something else. So i hope someone can take a closer look.
I'm certainly no expert on flush_dcache_page() and friends, but I'd
be very surprised if such a problem exists, yet has gone unnoticed
for so long. page_mapping_file() itself is fairly new, added when
a risk of crashing on a race with swapoff came in: but the previous
use of page_mapping() would have suffered equally if there were such
a cache flushinhg problem here.
And I'm afraid your patch won't do anything to help if there is a
problem: very soon after shmem calls add_to_swap_cache(), it calls
shmem_delete_from_page_cache(), which sets page->mapping to NULL.
But I can assure you that a shmem page (unlike an anon page) is never
put into swap cache while it is mapped into userspace, and never
mapped into userspace while it is still in swap cache: does that help?
Hugh
> This might be a shmem page that is in a sense a file that
> can be mapped multiple times in different processes at
> possibly different virtual addresses (fork + mremap). So
> return the shmem mapping that will allow any arch code to
> find all mappings of the page.
>
> Note that even if page is not anonymous then the page might
> have a NULL page->mapping field if it is being truncated,
> but then it is fine as each pte poiting to the page will be
> remove and cache flushing should be handled properly by that
> part of the code.
>
> Signed-off-by: Jerome Glisse <jglisse@xxxxxxxxxx>
> Cc: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Cc: Russell King <linux@xxxxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> Cc: "James E.J. Bottomley" <jejb@xxxxxxxxxxxxxxxx>
> ---
> mm/util.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/mm/util.c b/mm/util.c
> index 988d11e6c17c..ec8739ab0cc3 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -685,8 +685,24 @@ EXPORT_SYMBOL(page_mapping);
> */
> struct address_space *page_mapping_file(struct page *page)
> {
> - if (unlikely(PageSwapCache(page)))
> + if (unlikely(PageSwapCache(page))) {
> + /*
> + * This might be a shmem page that is in a sense a file that
> + * can be mapped multiple times in different processes at
> + * possibly different virtual addresses (fork + mremap). So
> + * return the shmem mapping that will allow any arch code to
> + * find all mappings of the page.
> + *
> + * Note that even if page is not anonymous then the page might
> + * have a NULL page->mapping field if it is being truncated,
> + * but then it is fine as each pte poiting to the page will be
> + * remove and cache flushing should be handled properly by that
> + * part of the code.
> + */
> + if (!PageAnon(page))
> + return page->mapping;
> return NULL;
> + }
> return page_mapping(page);
> }
>
> --
> 2.26.2