Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range

From: Zhenyu Ye
Date: Thu Apr 02 2020 - 07:24:22 EST


Hi Peter,

On 2020/4/1 20:20, Peter Zijlstra wrote:
> On Wed, Apr 01, 2020 at 04:51:15PM +0800, Zhenyu Ye wrote:
>> On 2020/3/31 23:13, Peter Zijlstra wrote:
>
>>> Instead of trying to retro-fit flush_*tlb_range() to take an mmu_gather
>>> parameter, please replace them out-right.
>>>
>>
>> I'm sorry that I'm not sure what "replace them out-right" means. Do you
>> mean that I should define flush_*_tlb_range like this?
>>
>> #define flush_pmd_tlb_range(vma, addr, end) \
>> do { \
>> struct mmu_gather tlb; \
>> tlb_gather_mmu(&tlb, (vma)->vm_mm, addr, end); \
>> tlba.cleared_pmds = 1; \
>> flush_tlb_range(&tlb, vma, addr, end); \
>> tlb_finish_mmu(&tlb, addr, end); \
>> } while (0)
>>
>
> I was thinking to remove flush_*tlb_range() entirely (from generic
> code).
>
> And specifically to not use them like the above; instead extend the
> mmu_gather API.
>
> Specifically, if you wanted to express flush_pmd_tlb_range() in mmu
> gather, you'd write it like:
>
> static inline void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end)
> {
> struct mmu_gather tlb;
>
> tlb_gather_mmu(&tlb, vma->vm_mm, addr, end);
> tlb_start_vma(&tlb, vma);
> tlb.cleared_pmds = 1;
> __tlb_adjust_range(addr, end - addr);
> tlb_end_vma(&tlb, vma);
> tlb_finish_mmu(&tlb, addr, end);
> }
>
> Except of course, that the code between start_vma and end_vma is not a
> proper mmu_gather API.
>
> So maybe add:
>
> tlb_flush_{pte,pmd,pud,p4d}_range()
>
> Then we can write:
>
> static inline void flush_XXX_tlb_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end)
> {
> struct mmu_gather tlb;
>
> tlb_gather_mmu(&tlb, vma->vm_mm, addr, end);
> tlb_start_vma(&tlb, vma);
> tlb_flush_XXX_range(&tlb, addr, end - addr);
> tlb_end_vma(&tlb, vma);
> tlb_finish_mmu(&tlb, addr, end);
> }
>
> But when I look at the output of:
>
> git grep flush_.*tlb_range -- :^arch/
>
> I doubt it makes sense to provide wrappers like the above.
>

Thanks for your detailed explanation. I notice that you used
`tlb_end_vma` replace `flush_tlb_range`, which will call `tlb_flush`,
then finally call `flush_tlb_range` in generic code. However, some
architectures define tlb_end_vma|tlb_flush|flush_tlb_range themselves,
so this may cause problems.

For example, in s390, it defines:

#define tlb_end_vma(tlb, vma) do { } while (0)

And it doesn't define it's own flush_pmd_tlb_range(). So there will be
a mistake if we changed flush_pmd_tlb_range() using tlb_end_vma().

Is this really a problem or something I understand wrong ?



If true, I think there are three ways to solve this problem:

1. use `flush_tlb_range` rather than `tlb_end_vma` in flush_XXX_tlb_range;
In this way, we still need retro-fit `flush_tlb_range` to take an mmu_gather
parameter.

2. use `tlb_flush` rather than `tlb_end_vma`.
There is a constraint such like:

#ifndef tlb_flush
#if defined(tlb_start_vma) || defined(tlb_end_vma)
#error Default tlb_flush() relies on default tlb_start_vma() and tlb_end_vma()
#endif

So all architectures that define tlb_{start|end}_vma have defined tlb_flush.
Also, we can add a constraint to flush_XXX_tlb_range such like:

#ifndef flush_XXX_tlb_range
#if defined(tlb_start_vma) || defined(tlb_end_vma)
#error Default flush_XXX_tlb_range() relies on default tlb_start/end_vma()
#endif

3. Define flush_XXX_tlb_range() architecture-self, and keep original define in
generic code, such as:

In arm64:
#define flush_XXX_tlb_range flush_XXX_tlb_range

In generic:
#ifndef flush_XXX_tlb_range
#define flush_XXX_tlb_range flush_tlb_range


Which do you think is more appropriate?


> ( Also, we should probably remove the (addr, end) arguments from
> tlb_finish_mmu(), Will? )
>

This can be changed quickly. If you want I can do this with a
separate patch.

> ---
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index f391f6b500b4..be5452a8efaa 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -511,6 +511,34 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
> }
> #endif
>
> +static inline void tlb_flush_pte_range(struct mmu_gather *tlb,
> + unsigned long address, unsigned long size)
> +{
> + __tlb_adjust_range(tlb, address, size);
> + tlb->cleared_ptes = 1;
> +}
> +
> +static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
> + unsigned long address, unsigned long size)
> +{
> + __tlb_adjust_range(tlb, address, size);
> + tlb->cleared_pmds = 1;
> +}
> +
> +static inline void tlb_flush_pud_range(struct mmu_gather *tlb,
> + unsigned long address, unsigned long size)
> +{
> + __tlb_adjust_range(tlb, address, size);
> + tlb->cleared_puds = 1;
> +}
> +
> +static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
> + unsigned long address, unsigned long size)
> +{
> + __tlb_adjust_range(tlb, address, size);
> + tlb->cleared_p4ds = 1;
> +}
> +

By the way, I think the name of tlb_set_XXX_range() is more suitable, because
we don't do actual flush there.

> #ifndef __tlb_remove_tlb_entry
> #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
> #endif
> @@ -524,8 +552,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
> */
> #define tlb_remove_tlb_entry(tlb, ptep, address) \
> do { \
> - __tlb_adjust_range(tlb, address, PAGE_SIZE); \
> - tlb->cleared_ptes = 1; \
> + tlb_flush_pte_range(tlb, address, PAGE_SIZE); \
> __tlb_remove_tlb_entry(tlb, ptep, address); \
> } while (0)
>
> @@ -550,8 +577,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
>
> #define tlb_remove_pmd_tlb_entry(tlb, pmdp, address) \
> do { \
> - __tlb_adjust_range(tlb, address, HPAGE_PMD_SIZE); \
> - tlb->cleared_pmds = 1; \
> + tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE); \
> __tlb_remove_pmd_tlb_entry(tlb, pmdp, address); \
> } while (0)
>
> @@ -565,8 +591,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
>
> #define tlb_remove_pud_tlb_entry(tlb, pudp, address) \
> do { \
> - __tlb_adjust_range(tlb, address, HPAGE_PUD_SIZE); \
> - tlb->cleared_puds = 1; \
> + tlb_flush_pud_range(tlb, address, HPAGE_PUD_SIZE); \
> __tlb_remove_pud_tlb_entry(tlb, pudp, address); \
> } while (0)
>
>
> .
>