Re: [PATCH v3 05/11] x86/mm: Track the TLB's tlb_gen and update the flushing algorithm

From: Thomas Gleixner
Date: Wed Jun 21 2017 - 04:33:03 EST


On Tue, 20 Jun 2017, Andy Lutomirski wrote:
> struct flush_tlb_info {
> + /*
> + * We support several kinds of flushes.
> + *
> + * - Fully flush a single mm. flush_mm will be set, flush_end will be

flush_mm is the *mm member in the struct, right? You might rename that as a
preparatory step so comments and implementation match.

> + * TLB_FLUSH_ALL, and new_tlb_gen will be the tlb_gen to which the
> + * IPI sender is trying to catch us up.
> + *
> + * - Partially flush a single mm. flush_mm will be set, flush_start
> + * and flush_end will indicate the range, and new_tlb_gen will be
> + * set such that the changes between generation new_tlb_gen-1 and
> + * new_tlb_gen are entirely contained in the indicated range.
> + *
> + * - Fully flush all mms whose tlb_gens have been updated. flush_mm
> + * will be NULL, flush_end will be TLB_FLUSH_ALL, and new_tlb_gen
> + * will be zero.
> + */
> struct mm_struct *mm;
> unsigned long start;
> unsigned long end;
> + u64 new_tlb_gen;

Nit: while at it, could you please align the struct members in tabular
fashion, as we usually do in x86?
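
For illustration only, a minimal sketch of what tabular alignment could look like for the members quoted above. The u64 typedef and the forward declaration of mm_struct are userspace stand-ins (assumptions) so the snippet compiles outside the kernel; the member names are taken from the quoted hunk as-is.

```c
#include <stdint.h>

/* Assumption: userspace stand-in for the kernel's u64 type. */
typedef uint64_t u64;

/* Assumption: opaque forward declaration; the real type lives in the kernel. */
struct mm_struct;

/* Members as quoted in the patch, with types and names aligned in columns. */
struct flush_tlb_info {
	struct mm_struct	*mm;
	unsigned long		start;
	unsigned long		end;
	u64			new_tlb_gen;
};
```

Only the whitespace changes; the layout and meaning of the struct stay the same.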

> static void flush_tlb_func_common(const struct flush_tlb_info *f,
> bool local, enum tlb_flush_reason reason)
> {
> + struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
> +
> + /*
> + * Our memory ordering requirement is that any TLB fills that
> + * happen after we flush the TLB are ordered after we read
> + * active_mm's tlb_gen. We don't need any explicit barrier
> + * because all x86 flush operations are serializing and the
> + * atomic64_read operation won't be reordered by the compiler.
> + */

Can you please move the comment above the loaded_mm assignment?

> + u64 mm_tlb_gen = atomic64_read(&loaded_mm->context.tlb_gen);
> + u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[0].tlb_gen);
> +
> /* This code cannot presently handle being reentered. */
> VM_WARN_ON(!irqs_disabled());
>
> + VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[0].ctx_id) !=
> + loaded_mm->context.ctx_id);
> +
> if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
> + /*
> + * leave_mm() is adequate to handle any type of flush, and
> + * we would prefer not to receive further IPIs.

While I know what you mean, it might be useful to have a more elaborate
explanation of why this prevents new IPIs.

> + */
> leave_mm(smp_processor_id());
> return;
> }
>
> - if (f->end == TLB_FLUSH_ALL) {
> - local_flush_tlb();
> - if (local)
> - count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
> - trace_tlb_flush(reason, TLB_FLUSH_ALL);
> - } else {
> + if (local_tlb_gen == mm_tlb_gen) {
> + /*
> + * There's nothing to do: we're already up to date. This can
> + * happen if two concurrent flushes happen -- the first IPI to
> + * be handled can catch us all the way up, leaving no work for
> + * the second IPI to be handled.

That's not restricted to IPIs, right? A local flush / IPI combo can do that
as well.
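
A minimal sketch of that case, written as a userspace model rather than kernel code. The names cpu_model and model_flush are hypothetical and exist only for this illustration. The point is that the "already up to date" check does not care whether the earlier catch-up came from an IPI or from a local flush.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-CPU state; not the kernel's cpu_tlbstate. */
struct cpu_model {
	uint64_t local_tlb_gen;	/* generation this CPU's TLB has caught up to */
};

/*
 * Hypothetical flush handler: returns true if a flush was needed,
 * false if the CPU was already at mm_tlb_gen. The caller may be a
 * local flush or an IPI handler; the check is the same for both.
 */
static bool model_flush(struct cpu_model *cpu, uint64_t mm_tlb_gen)
{
	if (cpu->local_tlb_gen == mm_tlb_gen)
		return false;	/* nothing to do, an earlier flush caught us up */

	/* A real handler would flush the TLB here. */
	cpu->local_tlb_gen = mm_tlb_gen;
	return true;
}
```

If a local flush runs first and catches the CPU up to the current generation, a later IPI for the same generation takes the early-return path, just as a second IPI would.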

Other than those nits:

Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>