Re: [PATCH] mm: migrate: add missing flush_dcache_page for non-mapped page migrate

From: Anshuman Khandual
Date: Tue Feb 26 2019 - 04:23:31 EST

On 02/19/2019 06:02 PM, Lars Persson wrote:
> Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
> and SIGSEGV that could not be traced back to a userspace code
> bug. They had all the magic signs of an I/D cache coherency issue.
>
> Now recently we noticed that the /proc/sys/vm/compact_memory interface
> was quite efficient at provoking this class of userspace crashes.
>
> Studying the code in mm/migrate.c there is a distinction made between
> migrating a page that is mapped at the instant of migration and one
> that is not mapped. Our problem turned out to be the non-mapped pages.
>
> For the non-mapped page the code performs a copy of the page content
> and all relevant meta-data of the page without doing the required
> D-cache maintenance. This leaves dirty data in the D-cache of the CPU
> and on the 1004K cores this data is not visible to the I-cache. A
> subsequent page-fault that triggers a mapping of the page will happily
> serve the process with potentially stale code.

Just curious: shouldn't the code path that later maps this page do the
invalidation just before installing it in the page table via
set_pte_at() or a similar variant? How does the page get mapped without
the necessary flush?
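
To make the scenario in the quoted description concrete, here is a toy
user-space model of it. This is only a sketch and not kernel code: the
toy_* names, the four-word "page", and the placeholder word values are
all made up for illustration. It assumes a write-back D-cache and an
instruction fetch that reads memory without snooping the D-cache, which
is how the description characterizes the 1004K cores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_WORDS 4
#define TOY_STALE 0xdeadbeefu  /* placeholder: old contents of the new page */
#define TOY_CODE0 0x12345678u  /* placeholder: first word of the migrated code */

/* Hypothetical model of one page: memory plus a write-back D-cache copy. */
struct toy_page {
	uint32_t ram[TOY_WORDS];     /* what an instruction fetch reads */
	uint32_t dcache[TOY_WORDS];  /* where CPU stores land first */
	bool dirty;                  /* D-cache holds data not yet in ram[] */
};

/* The migration copy is done by CPU stores, so it only reaches the D-cache. */
static void toy_copy_page(struct toy_page *dst, const uint32_t *src)
{
	memcpy(dst->dcache, src, sizeof(dst->dcache));
	dst->dirty = true;
}

/* Stand-in for the effect of flush_dcache_page(): write dirty data back. */
static void toy_flush_dcache_page(struct toy_page *page)
{
	if (page->dirty) {
		memcpy(page->ram, page->dcache, sizeof(page->ram));
		page->dirty = false;
	}
}

/* Instruction fetch on the modelled core: sees ram[], never the D-cache. */
static uint32_t toy_ifetch(const struct toy_page *page, unsigned int idx)
{
	assert(idx < TOY_WORDS);
	return page->ram[idx];
}

/* Migrate "code" into a new page, optionally flush, then fetch word 0. */
static uint32_t toy_migrate_and_fetch(bool flush)
{
	static const uint32_t code[TOY_WORDS] = { TOY_CODE0, 0, 0, 0 };
	struct toy_page newpage = {
		.ram = { TOY_STALE, TOY_STALE, TOY_STALE, TOY_STALE },
		.dirty = false,
	};

	toy_copy_page(&newpage, code);
	if (flush)
		toy_flush_dcache_page(&newpage);
	return toy_ifetch(&newpage, 0);
}
```

In this model toy_migrate_and_fetch(false) returns the stale word and
toy_migrate_and_fetch(true) returns the migrated one. It says nothing
about where the flush belongs in the real kernel, which is the question
above.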