Re: [PATCH] mm, swap: Remove unnecessary smp_rmb() in swap_type_to_swap_info()

From: Peter Zijlstra
Date: Fri May 14 2021 - 08:04:40 EST


On Thu, May 13, 2021 at 09:59:46PM -0400, Daniel Jordan wrote:
> On Thu, May 13, 2021 at 02:46:10PM +0200, Peter Zijlstra wrote:
> > Ah, I think I see what you meant to say, it would perhaps help if you
> > write it like so:
> >
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 149e77454e3c..94735248dcd2 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -99,11 +99,10 @@ atomic_t nr_rotate_swap = ATOMIC_INIT(0);
> >
> >  static struct swap_info_struct *swap_type_to_swap_info(int type)
> >  {
> > -	if (type >= READ_ONCE(nr_swapfiles))
> > +	if (type >= MAX_SWAPFILES)
> >  		return NULL;
> >
> > -	smp_rmb();	/* Pairs with smp_wmb in alloc_swap_info. */
> > -	return READ_ONCE(swap_info[type]);
> > +	return READ_ONCE(swap_info[type]); /* rcu_dereference() */
> >  }
> >
> >  static inline unsigned char swap_count(unsigned char ent)
> > @@ -2869,14 +2868,11 @@ static struct swap_info_struct *alloc_swap_info(void)
> >  	}
> >  	if (type >= nr_swapfiles) {
> >  		p->type = type;
> > -		WRITE_ONCE(swap_info[type], p);
> >  		/*
> > -		 * Write swap_info[type] before nr_swapfiles, in case a
> > -		 * racing procfs swap_start() or swap_next() is reading them.
> > -		 * (We never shrink nr_swapfiles, we never free this entry.)
> > +		 * Publish the swap_info_struct.
> >  		 */
> > -		smp_wmb();
> > -		WRITE_ONCE(nr_swapfiles, nr_swapfiles + 1);
> > +		smp_store_release(&swap_info[type], p); /* rcu_assign_pointer() */
> > +		nr_swapfiles++;
>
> Yes, this does help, I didn't understand why smp_wmb stayed around in
> the original post.
>
> I think the only access smp_store_release() orders is p->type. Wouldn't
> it be kinda inconsistent to only initialize that one field before
> publishing when many others would be done at the end of
> alloc_swap_info() after the fact? p->type doesn't seem special. For
> instance, get_swap_page_of_type() touches si->lock soon after it calls
> swap_type_to_swap_info(), so there could be a small window where there's
> a non-NULL si with an uninitialized lock.
>
> It's not as if this is likely to be a problem in practice, it would just
> make it harder to understand why smp_store_release is there. Maybe all
> we need is a WRITE_ONCE, or if it's really necessary for certain fields
> to be set before publication then move them up and explain?

You also care about the zero-fill from kvzalloc(). Without the
smp_store_release(), the zero-fill from the memset() might only become
visible 'late', after another CPU can already see the new pointer.

Unless that also isn't a problem?
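
For illustration, the publish pattern under discussion can be sketched in
userspace C11. This is only an analogue under stated assumptions, not the
kernel code: struct info, slots, MAX_SLOTS, publish_info(), lookup_info()
and demo() are hypothetical names, calloc() stands in for kvzalloc(), and a
C11 release store plus acquire load stands in for smp_store_release() and
the dependency-ordered READ_ONCE() in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SLOTS 4	/* hypothetical stand-in for MAX_SWAPFILES */

struct info {		/* hypothetical stand-in for swap_info_struct */
	int type;
	int flags;	/* never written; relies on the zero-fill */
};

/* Static storage, so every slot starts out as NULL. */
static _Atomic(struct info *) slots[MAX_SLOTS];

/* Writer: zero-fill, initialize, then publish with release semantics. */
static struct info *publish_info(int type)
{
	struct info *p;

	if (type < 0 || type >= MAX_SLOTS)
		return NULL;

	p = calloc(1, sizeof(*p));	/* zero-fill, as kvzalloc() would */
	if (!p)
		return NULL;
	p->type = type;

	/*
	 * The release store orders the zero-fill and the p->type store
	 * before the pointer becomes visible to readers.
	 */
	atomic_store_explicit(&slots[type], p, memory_order_release);
	return p;
}

/* Reader: bounds-check against the fixed maximum, then load the pointer. */
static struct info *lookup_info(int type)
{
	if (type < 0 || type >= MAX_SLOTS)
		return NULL;

	/*
	 * Acquire is an assumption made for this sketch; it is stronger
	 * than the dependency ordering the patch relies on.
	 */
	return atomic_load_explicit(&slots[type], memory_order_acquire);
}

/* Single-threaded usage check: a published entry is fully initialized. */
static void demo(void)
{
	struct info *p = publish_info(1);
	struct info *q = lookup_info(1);

	assert(p && q == p);
	assert(q->type == 1 && q->flags == 0);
}
```

With a plain relaxed store in publish_info() (the rough analogue of a bare
WRITE_ONCE()), a concurrent reader that sees the non-NULL pointer would have
no guarantee of also seeing the zeroed fields, which is the 'late'
visibility concern above. Fields initialized only after publication, such as
the lock Daniel mentions, are not covered by the release store either way.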