Re: [PATCH 1/2] kernfs: replace the mutex in kernfs_iop_permission with a rwlock

From: Fox Chen
Date: Thu Dec 03 2020 - 01:35:43 EST


Hi,

Thanks for your comments.

> On Wed, Dec 02, 2020 at 10:58:36PM +0800, Fox Chen wrote:
> > @@ -121,7 +121,7 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
> > if (!kn)
> > return -EINVAL;
> >
> > - mutex_lock(&kernfs_mutex);
> > + write_lock(&kn->iattr_rwlock);
> > error = setattr_prepare(dentry, iattr);
> > if (error)
> > goto out;
> > @@ -134,7 +134,7 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
> > setattr_copy(inode, iattr);
> >
> > out:
> > - mutex_unlock(&kernfs_mutex);
> > + write_unlock(&kn->iattr_rwlock);
> > return error;
> > }
>
> This is putting GFP_KERNEL allocation inside a rwlock. Can you please test
> with debug options including LOCKDEP and DEBUG_ATOMIC_SLEEP turned on?
>

Ok, I will try that.

Allocation is protected by the write_lock, only one thread can enter
this at a time. It should give the same protection as a mutex, right??
Or am I missing something here?? Any caveat?

On Thu, Dec 3, 2020 at 2:37 AM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> On Wed, Dec 02, 2020 at 10:58:36PM +0800, Fox Chen wrote:
> > diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
> > index 89f6a4214a70..545cdb39b34b 100644
> > --- a/include/linux/kernfs.h
> > +++ b/include/linux/kernfs.h
> > @@ -156,6 +156,7 @@ struct kernfs_node {
> > unsigned short flags;
> > umode_t mode;
> > struct kernfs_iattrs *iattr;
> > + rwlock_t iattr_rwlock;
> > };
>
> Also, while this might not look like much, kernfs_node is very size
> sensitive. There are systems with huge number of these nodes, so I don't
> think putting a per-node lock like this is a good idea. Either we can use a
> shared iattr protecting lock or play some cmpxchg games when allocating and
> setting ->iattr and put the lock there.
>

Initially, I tried to put rwlock in kn->iattr, but
__kernfs_setattr(kn, iattr) needs lock protection and kn->iattr may
not exist before calling __kernfs_setattr. It's a chicken-egg paradox.
:)
It's hard to solve. cmpxchg can help, but who sets kn->iattr first
should be clearly defined.

What about I used a global shared rwlock to protect all kn->iattr.
It's easier to implement and I think we read sysfs more than write to
it, I guess it won't be that slow compared to one kn per lock?


thanks,
fox