Re: [RFC PATCH] fs: elide the smp_rmb fence in fd_install()

From: Paul E. McKenney
Date: Thu Dec 05 2024 - 15:12:04 EST


On Thu, Dec 05, 2024 at 08:47:24PM +0100, Mateusz Guzik wrote:
> On Thu, Dec 5, 2024 at 8:26 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, 5 Dec 2024 at 10:41, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > > To my understanding this is the idiomatic way of spelling out the
> > > > non-existent in Linux smp_consume_load, for the resize_in_progress
> > > > flag.
> > >
> > > In Linus, "smp_consume_load()" is named rcu_dereference().
> >
> > Linux.
> >
> > But yes and no.
> >
> > It's worth making it really really clear that "rcu_dereference()" is
> > *not* just a different name for some "smp_consume_load()" operation.
> >
> > Why? Because a true smp_consume_load() would work with any random kind
> > of flags etc. And rcu_dereference() works only because it's a pointer,
> > and there's an inherent data dependency to what the result points to.
> >
> > Paul obviously knows this, but let's make it very clear in this
> > discussion, because if somebody decided "I want a smp_consume_load(),
> > and I'll use rcu_dereference() to do that", the end result would
> > simply not work for arbitrary data, like a flags field or something,
> > where comparing it against a value will only result in a control
> > dependency, not an actual data dependency.
>
> So I checked for kicks and rcu_dereference comes with type checking,
> as in passing something which is not a pointer even fails to compile.

That is by design, keeping in mind that consume loads order only
later dereferences against the pointer load.

> I'll note thought that a smp_load_consume_ptr or similarly named
> routine would be nice and I'm rather confused why it was not added
> given smp_load_acquire and smp_store_release being there.

In recent kernels, READ_ONCE() is your smp_load_consume_ptr(). Or things
like rcu_dereference_check(p, 1). But these can be used only when the
pointed-to object is guaranteed to live (as in not be freed) for the
full duration of the read-side use of that pointer.

> One immediate user would be mnt_idmap(), like so:
> iff --git a/include/linux/mount.h b/include/linux/mount.h
> index 33f17b6e8732..4d3486ff67ed 100644
> --- a/include/linux/mount.h
> +++ b/include/linux/mount.h
> @@ -76,7 +76,7 @@ struct vfsmount {
> static inline struct mnt_idmap *mnt_idmap(const struct vfsmount *mnt)
> {
> /* Pairs with smp_store_release() in do_idmap_mount(). */
> - return READ_ONCE(mnt->mnt_idmap);
> + return smp_load_consume_ptr(mnt->mnt_idmap);
> }
>
> extern int mnt_want_write(struct vfsmount *mnt);

These would have the same semantics. And in v6.12, this is instead
smp_load_acquire().

Thanx, Paul