Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement

From: Fox Chen
Date: Tue Dec 15 2020 - 03:34:50 EST


On Mon, Dec 14, 2020 at 9:30 PM Ian Kent <raven@xxxxxxxxxx> wrote:
>
> On Mon, 2020-12-14 at 14:14 +0800, Fox Chen wrote:
> > On Sun, Dec 13, 2020 at 11:46 AM Ian Kent <raven@xxxxxxxxxx> wrote:
> > > On Fri, 2020-12-11 at 10:17 +0800, Ian Kent wrote:
> > > > On Fri, 2020-12-11 at 10:01 +0800, Ian Kent wrote:
> > > > > > For the patches, there is a mutex_lock on kn->attr_mutex; as
> > > > > > Tejun mentioned here
> > > > > > (https://lore.kernel.org/lkml/X8fe0cmu+aq1gi7O@xxxxxxxxxxxxxxx/),
> > > > > > maybe a global rwsem for kn->iattr would be better?
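
To make concrete what I had in mind: attribute readers would take a
single global rwsem shared, and only first-time creation of kn->iattr
would take it exclusive. A rough, untested sketch (the rwsem name is
made up, not from any patch):

static DECLARE_RWSEM(kernfs_iattr_rwsem);

/* readers of existing attributes take the rwsem shared */
static kuid_t kernfs_iattr_uid(struct kernfs_node *kn)
{
	kuid_t uid = GLOBAL_ROOT_UID;

	down_read(&kernfs_iattr_rwsem);
	if (kn->iattr)
		uid = kn->iattr->ia_uid;
	up_read(&kernfs_iattr_rwsem);

	return uid;
}

First-time allocation of kn->iattr would take the same rwsem exclusive.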
> > > > >
> > > > > I wasn't sure about that. IIRC a spin lock could be used around
> > > > > the initial check, with the check repeated at the end, which
> > > > > would probably have been much faster but much less conservative
> > > > > and a bit more ugly, so I just went the conservative path since
> > > > > there was so much change already.
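
If I follow, the spin lock idea is the classic check/alloc/re-check
pattern, something like this rough sketch (kernfs_iattr_lock is a name
I made up; a real version would also need memory barriers on the
unlocked read, and the real kernfs_iattrs() additionally initializes
uid/gid/times and xattrs, omitted here):

static DEFINE_SPINLOCK(kernfs_iattr_lock);

static struct kernfs_iattrs *kernfs_iattrs(struct kernfs_node *kn)
{
	struct kernfs_iattrs *attrs;

	if (kn->iattr)			/* unlocked fast path */
		return kn->iattr;

	attrs = kmem_cache_zalloc(kernfs_iattrs_cache, GFP_KERNEL);
	if (!attrs)
		return NULL;

	spin_lock(&kernfs_iattr_lock);
	if (kn->iattr)			/* lost the race, drop our copy */
		kmem_cache_free(kernfs_iattrs_cache, attrs);
	else
		kn->iattr = attrs;
	attrs = kn->iattr;
	spin_unlock(&kernfs_iattr_lock);

	return attrs;
}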
> > > >
> > > > Sorry, I hadn't looked at Tejun's reply yet and TBH didn't
> > > > remember
> > > > it.
> > > >
> > > > Based on what Tejun said it sounds like that needs work.
> > >
> > > Those attribute handling patches were meant to allow taking the rw
> > > sem read lock instead of the write lock for kernfs_refresh_inode()
> > > updates, with the added locking to protect the inode attributes
> > > update since it's called from the VFS both with and without the
> > > inode lock.
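
So, as I understand it, the v2 series intends the permission path to
end up looking roughly like this (a sketch from my reading of the
patches, not the literal code):

static int kernfs_iop_permission(struct inode *inode, int mask)
{
	struct kernfs_node *kn;

	if (mask & MAY_NOT_BLOCK)	/* can't sleep during RCU walk */
		return -ECHILD;

	kn = inode->i_private;

	/* shared lock keeps the node db stable... */
	down_read(&kernfs_rwsem);
	/* ...while the per-node mutex serializes the attribute update */
	mutex_lock(&kn->attr_mutex);
	kernfs_refresh_inode(kn, inode);
	mutex_unlock(&kn->attr_mutex);
	up_read(&kernfs_rwsem);

	return generic_permission(inode, mask);
}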
> >
> > Oh, understood. I was also asking because the lock on kn->attr_mutex
> > drags down concurrent performance.
> >
> > > Looking around, it seems kernfs_iattrs() is called from multiple
> > > places without any node database lock at all.
> > >
> > > I'm thinking that, to keep my proposed change straightforward
> > > and on topic, I should just leave kernfs_refresh_inode() taking
> > > the node db write lock for now and consider the attributes handling
> > > as a separate change. Once that's done we could reconsider what's
> > > needed to use the node db read lock in kernfs_refresh_inode().
> >
> > You mean taking the write lock of kernfs_rwsem for
> > kernfs_refresh_inode()? It may be a lot slower in my benchmark;
> > let me test it.
>
> Yes, but make sure the write lock of kernfs_rwsem is being taken,
> not the read lock.
>
> That's a mistake I made initially.
>
> Still, that attributes handling is, I think, sufficient to warrant
> a separate change since it looks like it might need work. The kernfs
> node db probably should be kept stable for those attribute updates,
> but equally the existence of an instantiated dentry might mitigate
> that.
>
> Some people might just know whether it's ok or not, but I would like
> to check the callers to work out what's going on.
>
> In any case, it's academic if GKH isn't willing to consider the series
> for review and possible merge.
>
Hi Ian

I removed kn->attr_mutex and changed the read lock to a write lock
around kernfs_refresh_inode():

/* e.g. in kernfs_iop_permission() */
down_write(&kernfs_rwsem);
kernfs_refresh_inode(kn, inode);
up_write(&kernfs_rwsem);


Unfortunately, this change makes things worse: my benchmark runs 100%
slower than upstream sysfs. :(
A concurrent open+read+close of a sysfs file took ~1000us, while
current sysfs, with the single big kernfs_mutex, only takes ~500us
per open+read+close operation under the same concurrency.
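
For reference, the benchmark is essentially N threads hammering the
same sysfs attribute with open+read+close in a loop; a minimal sketch
(path and thread count are placeholders, and the real harness also
records per-operation latency):

#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

#define NTHREADS 16
#define ITERS    10000

static void *worker(void *arg)
{
	char buf[4096];
	int i, fd;

	(void)arg;
	for (i = 0; i < ITERS; i++) {
		fd = open("/sys/class/net/lo/mtu", O_RDONLY);
		if (fd < 0)
			break;
		read(fd, buf, sizeof(buf));
		close(fd);
	}
	return NULL;
}

int main(void)
{
	pthread_t tids[NTHREADS];
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tids[i], NULL, worker, NULL);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tids[i], NULL);

	return 0;
}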

|--45.93%--kernfs_iop_permission
|          |
|          |--22.55%--down_write
|                     |
|                      --20.69%--rwsem_down_write_slowpath
|                                |
|                                |--8.89%--schedule

perf showed that most of the time was spent in kernfs_iop_permission.


thanks,
fox