Re: [PATCH 08/11] x86/mm: enable broadcast TLB invalidation for multi-threaded processes

From: Nadav Amit
Date: Wed Dec 25 2024 - 18:22:39 EST

On 23 Dec 2024, at 4:55, Rik van Riel <riel@xxxxxxxxxxx> wrote:

> +static int mm_active_cpus(struct mm_struct *mm)
> +{
> +	int count = 0;
> +	int cpu;
> +
> +	for_each_cpu(cpu, mm_cpumask(mm)) {
> +		/* Skip the CPUs that aren't really running this process. */
> +		if (per_cpu(cpu_tlbstate.loaded_mm, cpu) != mm)
> +			continue;
> +
> +		if (per_cpu(cpu_tlbstate_shared.is_lazy, cpu))
> +			continue;
> +
> +		count++;
> +	}
> +	return count;
> +}

Since you only care whether the number of “mm active CPUs” is greater
than a certain threshold, don’t you want to add some checks for early
termination? That would avoid bouncing the cpu_tlbstate cachelines back
and forth between CPUs.

For instance, run cpumask_weight() first: if the weight is lower than
the threshold, there is no need to loop at all. Similarly, once the
threshold has been crossed inside the loop, there is no need for further
iterations.
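
To make the two checks concrete, here is a rough userspace model, not
kernel code and not meant as the final form. The struct cpu_state array,
the 64-bit mask and the helper names mask_weight() and
mm_active_cpus_exceeds() are all made up for illustration; they stand in
for the per-CPU cpu_tlbstate fields, mm_cpumask() and cpumask_weight().
I am also assuming the caller wants "strictly more than threshold"; adjust
the comparisons if the patch needs ">=".

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 64

/* Userspace stand-in for the per-CPU state the quoted loop reads. */
struct cpu_state {
	const void *loaded_mm;	/* models cpu_tlbstate.loaded_mm */
	bool is_lazy;		/* models cpu_tlbstate_shared.is_lazy */
};

static struct cpu_state cpu_state[NR_CPUS];

/* Models cpumask_weight(): number of bits set in the mask. */
static int mask_weight(uint64_t mask)
{
	int weight = 0;

	for (; mask; mask &= mask - 1)
		weight++;
	return weight;
}

/*
 * Hypothetical helper: returns true if strictly more than 'threshold'
 * CPUs in 'cpumask' have 'mm' loaded and are not in lazy TLB mode.
 */
static bool mm_active_cpus_exceeds(const void *mm, uint64_t cpumask,
				   int threshold)
{
	int count = 0;
	int cpu;

	/* The mask weight is an upper bound, so this needs no per-CPU reads. */
	if (mask_weight(cpumask) <= threshold)
		return false;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(cpumask & (UINT64_C(1) << cpu)))
			continue;

		/* Skip the CPUs that aren't really running this process. */
		if (cpu_state[cpu].loaded_mm != mm)
			continue;

		if (cpu_state[cpu].is_lazy)
			continue;

		/* Stop touching remote state once the answer is known. */
		if (++count > threshold)
			return true;
	}
	return false;
}
```

With this shape the caller gets a yes/no answer instead of a count, and
in the common small-process case the per-CPU state is never read at all.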