Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access

From: Huacai Chen
Date: Mon Oct 14 2024 - 02:31:44 EST


Hi, Bibo,

On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@xxxxxxxxxxx> wrote:
>
> A spurious fault may be raised if memory is accessed right after
> the pte is set. For user address space, the pte is set in kernel
> space and the memory is accessed in user space, so there is ample
> time for synchronization and no barrier is needed. For kernel
> address space, however, memory may be accessed right after the
> pte is set.
>
> Here flush_cache_vmap/flush_cache_vmap_early is used for
> synchronization.
>
> Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
> ---
> arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
> index f8754d08a31a..53be231319ef 100644
> --- a/arch/loongarch/include/asm/cacheflush.h
> +++ b/arch/loongarch/include/asm/cacheflush.h
> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
> #define flush_cache_dup_mm(mm) do { } while (0)
> #define flush_cache_range(vma, start, end) do { } while (0)
> #define flush_cache_page(vma, vmaddr, pfn) do { } while (0)
> -#define flush_cache_vmap(start, end) do { } while (0)
> #define flush_cache_vunmap(start, end) do { } while (0)
> #define flush_icache_user_page(vma, page, addr, len) do { } while (0)
> #define flush_dcache_mmap_lock(mapping) do { } while (0)
> #define flush_dcache_mmap_unlock(mapping) do { } while (0)
>
> +/*
> + * It is possible for a kernel virtual mapping access to return a spurious
> + * fault if it's accessed right after the pte is set. The page fault handler
> + * does not expect this type of fault. flush_cache_vmap is not exactly the
> + * right place to put this, but it seems to work well enough.
> + */
> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
> +{
> + smp_mb();
> +}
> +#define flush_cache_vmap flush_cache_vmap
> +#define flush_cache_vmap_early flush_cache_vmap
From the history of flush_cache_vmap_early(), it seems that only archs
with a "virtual cache" (VIVT or VIPT) need this API, so on LoongArch it
can be a no-op.

And I still think flush_cache_vunmap() should be an smp_mb(). An
smp_mb() in flush_cache_vmap() prevents subsequent accesses from being
reordered before set_pte(), and an smp_mb() in flush_cache_vunmap()
prevents preceding accesses from being reordered after pte_clear(). The
problem may not show up in experiments, but the barrier is needed in
theory.
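
To make the ordering concrete, here is a rough userspace sketch, not the
actual patch. It assumes smp_mb() can be modelled with a C11 seq_cst
fence; fake_pte, backing_page and map_use_unmap() are hypothetical names
invented for the illustration and are not kernel symbols.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: a C11 full fence stands in for the kernel's smp_mb(). */
#define smp_mb() atomic_thread_fence(memory_order_seq_cst)

/* Hypothetical one-entry "page table" and the memory it maps. */
static _Atomic uintptr_t fake_pte;
static int backing_page;

/* No virtual cache on LoongArch, so the early variant can stay a no-op. */
#define flush_cache_vmap_early(start, end) do { } while (0)

static inline void flush_cache_vmap(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	smp_mb();	/* later accesses stay after the pte store */
}

static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	smp_mb();	/* earlier accesses stay before the pte clear */
}

/* Map, access, unmap: the barriers bracket the window of valid accesses. */
static int map_use_unmap(int value)
{
	unsigned long start = (unsigned long)(uintptr_t)&backing_page;
	unsigned long end = start + sizeof(backing_page);
	int *p;
	int seen;

	/* "set_pte()", then the barrier on the map side */
	atomic_store_explicit(&fake_pte, (uintptr_t)&backing_page,
			      memory_order_relaxed);
	flush_cache_vmap(start, end);

	/* access through the "mapping" */
	p = (int *)atomic_load_explicit(&fake_pte, memory_order_relaxed);
	*p = value;
	seen = *p;

	/* the barrier on the unmap side, then "pte_clear()" */
	flush_cache_vunmap(start, end);
	atomic_store_explicit(&fake_pte, 0, memory_order_relaxed);

	return seen;
}
```

The call order mirrors the generic vmalloc code, where flush_cache_vmap()
runs after the ptes are written and flush_cache_vunmap() runs before they
are cleared. A single-threaded run cannot expose the reordering itself;
the sketch only shows where the two barriers sit.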

Huacai

> +
> #define cache_op(op, addr) \
> __asm__ __volatile__( \
> " cacop %0, %1 \n" \
> --
> 2.39.3
>
>