Re: [PATCH v2 09/11] mm/vmstat: use cmpxchg loop in cpu_vm_stats_fold

From: Marcelo Tosatti
Date: Thu Mar 02 2023 - 09:51:52 EST


On Wed, Mar 01, 2023 at 05:57:08PM -0500, Peter Xu wrote:
> On Thu, Feb 09, 2023 at 12:01:59PM -0300, Marcelo Tosatti wrote:
> > /*
> > - * Fold the data for an offline cpu into the global array.
> > + * Fold the data for a cpu into the global array.
> > * There cannot be any access by the offline cpu and therefore
> > * synchronization is simplified.
> > */
> > @@ -906,8 +906,9 @@ void cpu_vm_stats_fold(int cpu)
> > if (pzstats->vm_stat_diff[i]) {
> > int v;
> >
> > - v = pzstats->vm_stat_diff[i];
> > - pzstats->vm_stat_diff[i] = 0;
> > + do {
> > + v = pzstats->vm_stat_diff[i];
> > + } while (!try_cmpxchg(&pzstats->vm_stat_diff[i], &v, 0));
>
> IIUC try_cmpxchg will update "v" already, so I'd assume this'll work the
> same:
>
> while (!try_cmpxchg(&pzstats->vm_stat_diff[i], &v, 0));
>
> Then I figured, maybe it's easier to use xchg()?

Yes, fixed.
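
For illustration only, a user-space C11 analogue of the two forms
discussed above. This is a sketch, not the kernel code: fold_cmpxchg()
and fold_xchg() are hypothetical names, a plain int stands in for the
vm_stat_diff[] element, and atomic_compare_exchange_weak() /
atomic_exchange() stand in for try_cmpxchg() / xchg().

```c
#include <stdatomic.h>

/*
 * cmpxchg form: when the compare-exchange fails it writes the current
 * value of *diff back into 'v', so the loop body does not need to
 * reload it.
 */
static int fold_cmpxchg(_Atomic int *diff)
{
	int v = atomic_load(diff);

	while (!atomic_compare_exchange_weak(diff, &v, 0))
		;	/* 'v' was refreshed by the failed attempt; retry */

	return v;
}

/* xchg form: fetch the old value and store 0 in one atomic step. */
static int fold_xchg(_Atomic int *diff)
{
	return atomic_exchange(diff, 0);
}
```

Both helpers return the value that was folded and leave the slot at 0;
the xchg form simply has no retry loop to get wrong.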

> I've no knowledge at all on cpu offline code, so sorry if this will be a
> naive question. But from what I understand this should not be touched by
> anyone else. Reasons:
>
> (1) cpu_vm_stats_fold() is only called in page_alloc_cpu_dead(), and the
> comment says:
>
> /*
> * Zero the differential counters of the dead processor
> * so that the vm statistics are consistent.
> *
> * This is only okay since the processor is dead and cannot
> * race with what we are doing.
> */
> cpu_vm_stats_fold(cpu);
>
> so.. I think that's what it says..

That comment refers to the counter updates being performed with
this_cpu operations.

If both the updater and reader use atomic accesses (which is the case after patch 8:
"mm/vmstat: switch counter modification to cmpxchg"), and
CONFIG_HAVE_CMPXCHG_LOCAL is set, then the comment is stale.

Removed it.

> (2) If someone can modify the dead cpu's vm_stat_diff,

The only contexts that can modify the cpu's vm_stat_diff are:

1) The CPU itself (increases the counter).
2) cpu_vm_stats_fold (from vmstat_shepherd kernel thread), from
x -> 0 only.

So you should not be able to increase the counter after this point.
I suppose this is what this comment refers to.
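
As a rough user-space sketch of the two rules above (hypothetical
names, C11 atomics instead of kernel primitives, and a plain
fetch-add standing in for the cmpxchg-based updater): the owning CPU
only adds to its delta, the folder only moves x -> 0 and credits x to
the global counter. Under those assumptions no update is lost; an
update that lands after a fold just stays in the per-CPU delta until
the next fold.

```c
#include <stdatomic.h>

static _Atomic int cpu_diff;		/* stand-in for vm_stat_diff[i] */
static _Atomic long global_stat;	/* stand-in for the global counter */

/* Context 1: the owning CPU adds to its own delta. */
static void cpu_update(int delta)
{
	atomic_fetch_add(&cpu_diff, delta);
}

/* Context 2: the folder takes x -> 0 and credits x globally. */
static void fold(void)
{
	int v = atomic_exchange(&cpu_diff, 0);

	if (v)
		atomic_fetch_add(&global_stat, v);
}

/* Global value plus whatever is still pending in the per-CPU delta. */
static long total(void)
{
	return atomic_load(&global_stat) + atomic_load(&cpu_diff);
}
```

If cpu_update() runs again after the last fold(), total() still
accounts for it, but global_stat alone does not until another fold
happens.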

> what guarantees it
> won't be e.g. boosted again right after try_cmpxchg() / xchg()
> returns? What to do with the left-overs?

If any code runs on the CPU that is being hotunplugged,
after cpu_vm_stats_fold (from page_alloc_cpu_dead), then there will
be left-overs. But such bugs would exist today as well.

Or, if that bug exists, you could replace "for_each_online_cpu" with
"for_each_cpu" here:

static void vmstat_shepherd(struct work_struct *w)
{
	int cpu;

	cpus_read_lock();
	/* Check processors whose vmstat worker threads have been disabled */
	for_each_online_cpu(cpu) {