Re: [PATCH 0/3] x86,tlb: context switch optimizations

From: Peter Zijlstra
Date: Thu Nov 14 2024 - 06:36:38 EST


On Wed, Nov 13, 2024 at 09:55:57AM -0500, Rik van Riel wrote:
> arch/x86/kernel/alternative.c | 11 +++--------
> 1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index d17518ca19b8..f3caf5bc4df9 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1830,6 +1830,9 @@ static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
> lockdep_assert_irqs_disabled();
> switch_mm_irqs_off(NULL, prev_state.mm, current);
>
> + /* Force a TLB flush next time poking_mm is used. */
> + inc_mm_tlb_gen(poking_mm);
> +
> /*
> * Restore the breakpoints if they were disabled before the temporary mm
> * was loaded.
> @@ -1940,14 +1943,6 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
> */
> unuse_temporary_mm(prev);
>
> - /*
> - * Flushing the TLB might involve IPIs, which would require enabled
> - * IRQs, but not if the mm is not used, as it is in this point.
> - */
> - flush_tlb_mm_range(poking_mm, poking_addr, poking_addr +
> - (cross_page_boundary ? 2 : 1) * PAGE_SIZE,
> - PAGE_SHIFT, false);
> -
> if (func == text_poke_memcpy) {
> /*
> * If the text does not match what we just wrote then something is

So I really don't like this.

Yes, it avoids the immediate problem, but we're now sending IPIs where we
shouldn't be.

Fundamentally, this whole text_poke thing is set up such that only a
single CPU will have this magical mapping with the aliases. Having it
send IPIs is crazy.