Re: [PATCH 1/2] mm: swap: update inuse_pages after all cleanups are done
From: Yosry Ahmed
Date: Tue Jan 23 2024 - 23:16:26 EST
On Tue, Jan 23, 2024 at 7:29 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>
> Yosry Ahmed <yosryahmed@xxxxxxxxxx> writes:
>
> >> > In swap_range_free, we want to make sure that the write to
> >> > si->inuse_pages in swap_range_free() happens *after* the cleanups
> >> > (specifically zswap_invalidate() in this case).
> >> > In swap_off, we want to make sure that the cleanups following
> >> > try_to_unuse() (e.g. zswap_swapoff) happen *after* reading
> >> > si->inuse_pages == 0 in try_to_unuse().
> >> >
> >> > So I think we want smp_wmb() in swap_range_free() and smp_mb() in
> >> > try_to_unuse(). Does the below look correct to you?
> >> >
> >> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> >> > index 2fedb148b9404..a2fa2f65a8ddd 100644
> >> > --- a/mm/swapfile.c
> >> > +++ b/mm/swapfile.c
> >> > @@ -750,6 +750,12 @@ static void swap_range_free(struct
> >> > swap_info_struct *si, unsigned long offset,
> >> > offset++;
> >> > }
> >> > clear_shadow_from_swap_cache(si->type, begin, end);
> >> > +
> >> > + /*
> >> > + * Make sure that try_to_unuse() observes si->inuse_pages reaching 0
> >> > + * only after the above cleanups are done.
> >> > + */
> >> > + smp_wmb();
> >> > atomic_long_add(nr_entries, &nr_swap_pages);
> >> > WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
> >> > }
> >> > @@ -2130,6 +2136,11 @@ static int try_to_unuse(unsigned int type)
> >> > return -EINTR;
> >> > }
> >> >
> >> > + /*
> >> > + * Make sure that further cleanups after try_to_unuse() returns happen
> >> > + * after swap_range_free() reduces si->inuse_pages to 0.
> >> > + */
> >> > + smp_mb();
> >> > return 0;
> >> > }
> >>
> >> We need to take care of "si->inuse_pages" checking at the beginning of
> >> try_to_unuse() too. Otherwise, it looks good to me.
> >
> > Hmm, why isn't one barrier at the end of the function enough? I think
> > all we need is that before we return from try_to_unuse(), all the
> > cleanups in swap_range_free() are taken care of, which the barrier at
> > the end should be doing. We just want instructions after
> > try_to_unuse() to not get re-ordered before si->inuse_pages is read as
> > 0, right?
>
> Because at the begin of try_to_unuse() as below, after reading, function
> returns directly without any memory barriers.
>
> if (!READ_ONCE(si->inuse_pages))
> return 0;
Right, I missed this one. Let me fix this up and send a v2.
Thanks!