Re: [PATCH v10 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes

From: Peter Zijlstra
Date: Wed Feb 12 2025 - 04:55:03 EST


On Tue, Feb 11, 2025 at 04:08:04PM -0500, Rik van Riel wrote:

> +static void broadcast_tlb_flush(struct flush_tlb_info *info)
> +{
> +	bool pmd = info->stride_shift == PMD_SHIFT;
> +	unsigned long maxnr = invlpgb_count_max;
> +	unsigned long asid = info->mm->context.global_asid;
> +	unsigned long addr = info->start;
> +	unsigned long nr;
> +
> +	/* Flushing multiple pages at once is not supported with 1GB pages. */
> +	if (info->stride_shift > PMD_SHIFT)
> +		maxnr = 1;

How does this work?

Normally, if we get a 1GB range, we'll iterate on the stride and INVLPG
each one (just like any other stride).

Should you not instead either force the stride down to PMD level or
force a full flush?

> +
> +	/*
> +	 * TLB flushes with INVLPGB are kicked off asynchronously.
> +	 * The inc_mm_tlb_gen() guarantees page table updates are done
> +	 * before these TLB flushes happen.
> +	 */
> +	if (info->end == TLB_FLUSH_ALL) {
> +		invlpgb_flush_single_pcid_nosync(kern_pcid(asid));
> +		/* Do any CPUs supporting INVLPGB need PTI? */
> +		if (static_cpu_has(X86_FEATURE_PTI))
> +			invlpgb_flush_single_pcid_nosync(user_pcid(asid));
> +	} else do {
> +		/*
> +		 * Calculate how many pages can be flushed at once; if the
> +		 * remainder of the range is less than one page, flush one.
> +		 */
> +		nr = min(maxnr, (info->end - addr) >> info->stride_shift);
> +		nr = max(nr, 1);
> +
> +		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd);
> +		/* Do any CPUs supporting INVLPGB need PTI? */
> +		if (static_cpu_has(X86_FEATURE_PTI))
> +			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd);
> +
> +		addr += nr << info->stride_shift;
> +	} while (addr < info->end);
> +
> +	finish_asid_transition(info);
> +
> +	/* Wait for the INVLPGBs kicked off above to finish. */
> +	tlbsync();
> +}