Re: [PATCH v9 3/5] mm/vmstat: manage per-CPU stats from CPU context when NOHZ full

From: Frederic Weisbecker
Date: Wed Dec 14 2022 - 08:18:52 EST


On Tue, Dec 06, 2022 at 01:18:29PM -0300, Marcelo Tosatti wrote:
> For nohz full CPUs, manage per-CPU stat syncing from CPU context:
> start delayed work when marking per-CPU vmstat dirty.
>
> When returning to userspace, fold the stats and cancel the delayed work.
>
> When entering idle, only fold the stats.

The changelog still doesn't explain the reason behind these changes.

> @@ -195,9 +196,24 @@ void fold_vm_numa_events(void)
>
> #ifdef CONFIG_SMP
> static DEFINE_PER_CPU_ALIGNED(bool, vmstat_dirty);
> +static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
> +int sysctl_stat_interval __read_mostly = HZ;
>
> static inline void vmstat_mark_dirty(void)
> {
> + int cpu = smp_processor_id();
> +
> + if (tick_nohz_full_cpu(cpu) && !this_cpu_read(vmstat_dirty)) {
> + struct delayed_work *dw;
> +
> + dw = &per_cpu(vmstat_work, cpu);

Use this_cpu_ptr(&vmstat_work) here instead of per_cpu() with an explicit
CPU number.

> + if (!delayed_work_pending(dw)) {
> + unsigned long delay;
> +
> + delay = round_jiffies_relative(sysctl_stat_interval);
> + queue_delayed_work_on(cpu, mm_percpu_wq, dw, delay);
> + }
> + }
> this_cpu_write(vmstat_dirty, true);
> }
>
> @@ -1973,21 +1986,27 @@ static void vmstat_update(struct work_st
> * until the diffs stay at zero. The function is used by NOHZ and can only be
> * invoked when tick processing is not active.
> */
> -void quiet_vmstat(void)
> +void quiet_vmstat(bool user)
> {
> + struct delayed_work *dw;
> +
> if (system_state != SYSTEM_RUNNING)
> return;
>
> if (!is_vmstat_dirty())
> return;
>
> + refresh_cpu_vm_stats(false);
> +
> + if (!user)
> + return;
> /*
> - * Just refresh counters and do not care about the pending delayed
> - * vmstat_update. It doesn't fire that often to matter and canceling
> - * it would be too expensive from this path.
> - * vmstat_shepherd will take care about that for us.
> + * If the tick is stopped, cancel any delayed work to avoid
> + * interruptions to this CPU in the future.
> */
> - refresh_cpu_vm_stats(false);
> + dw = &per_cpu(vmstat_work, smp_processor_id());

Same here: this_cpu_ptr(&vmstat_work), which also drops the
smp_processor_id() call.

Thanks.

> + if (delayed_work_pending(dw))
> + cancel_delayed_work(dw);
> }
>
> /*