RE: [PATCH RFC v2 4/4] 9p: fix race issue in fid contention.
From: Jianyong Wu
Date: Thu Nov 05 2020 - 02:05:45 EST
Hi Dominique,
Good News!
I'm happy to see that you have time to pick this up again. All the changes are OK for me.
Thanks
Jianyong
> -----Original Message-----
> From: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>
> Sent: Tuesday, November 3, 2020 6:41 PM
> To: Jianyong Wu <Jianyong.Wu@xxxxxxx>
> Cc: ericvh@xxxxxxxxx; lucho@xxxxxxxxxx; qemu_oss@xxxxxxxxxxxxx;
> groug@xxxxxxxx; v9fs-developer@xxxxxxxxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Justin He <Justin.He@xxxxxxx>
> Subject: Re: [PATCH RFC v2 4/4] 9p: fix race issue in fid contention.
>
> Jianyong,
>
> Jianyong Wu wrote on Wed, Sep 23, 2020:
> > Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall
> > bug in 9p. But there is race issue in fid comtention.
> > As Greg's patch stores all of fids from opened files into according
> > inode, so all the lookup fid ops can retrieve fid from inode
> > preferentially. But there is no mechanism to handle the fid comtention
> > issue. For example, there are two threads get the same fid in the same
> > time and one of them clunk the fid before the other thread ready to
> > discard the fid. In this scenario, it will lead to some fatal problems, even
> kernel core dump.
> >
> > I introduce a mechanism to fix this race issue. A counter field
> > introduced into p9_fid struct to store the reference counter to the
> > fid. When a fid is allocated from the inode or dentry, the counter
> > will increase, and will decrease at the end of its occupation. It is
> > guaranteed that the fid won't be clunked before the reference counter
> > go down to 0, then we can avoid the clunked fid to be used.
> >
> > tests:
> > race issue test from the old test case:
> > for file in {01..50}; do touch f.${file}; done seq 1 1000 | xargs -n 1
> > -P 50 -I{} cat f.* > /dev/null
> >
> > open-unlink-f*syscall test:
> > I have tested for f*syscall include: ftruncate fstat fchown fchmod faccessat.
> >
> > Signed-off-by: Jianyong Wu <jianyong.wu@xxxxxxx>
>
> Ok so I've finally taken some time to test -- sorry for the delay.
> I didn't bother with qemu but the use-after-close f* calls work with nfs-
> ganesha and it doesn't introduce any obvious regression with the current
> qemu either, so this is good for me.
>
> Greg, Christian - from what I understood (in private, hopefully I'm allowed to
> repeat!), he won't be able to contribute to qemu because of company
> policies and I'm unlikely to take the time either right now.
> I don't think it's a problem to continue as is though, we can land linux kernel
> support (it's still useful for non-qemu servers) and if someone is interested
> later on they'll just need to finish that bit.
>
>
> I'm not seeing any obvious problem now so I'll take these patches in -next
> within the next few days, with this extra patch on top that basically applies
> the requests I had:
> - removing the extra atomic_set in fs/9p/fid.c
> - convert from atomic to refcount API (overflow checks)
> - rename a no-composant walk to clone_fid()
>
> I'll just run some more instrumented tests first to check we're not leaking
> anything but so far I haven't found any problem.
>
> If you noticed anything else please speak up.
> Thanks for taking the time to finish this! :)
> ---
> fs/9p/fid.c | 10 ++++------
> fs/9p/fid.h | 2 +-
> include/net/9p/client.h | 2 +-
> net/9p/client.c | 4 ++--
> 4 files changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/fs/9p/fid.c b/fs/9p/fid.c
> index 89643dabcdae..50118ec72a92 100644
> --- a/fs/9p/fid.c
> +++ b/fs/9p/fid.c
> @@ -28,7 +28,6 @@
>
> static inline void __add_fid(struct dentry *dentry, struct p9_fid *fid) {
> -atomic_set(&fid->count, 1);
> hlist_add_head(&fid->dlist, (struct hlist_head *)&dentry-
> >d_fsdata); }
>
> @@ -62,7 +61,7 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode
> *inode, kuid_t uid)
> }
> }
> if (ret && !IS_ERR(ret))
> -atomic_inc(&ret->count);
> +refcount_inc(&ret->count);
> spin_unlock(&inode->i_lock);
> return ret;
> }
> @@ -77,7 +76,6 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode
> *inode, kuid_t uid) void v9fs_open_fid_add(struct inode *inode, struct
> p9_fid *fid) {
> spin_lock(&inode->i_lock);
> -atomic_set(&fid->count, 1);
> hlist_add_head(&fid->ilist, (struct hlist_head *)&inode->i_private);
> spin_unlock(&inode->i_lock);
> }
> @@ -110,7 +108,7 @@ static struct p9_fid *v9fs_fid_find(struct dentry
> *dentry, kuid_t uid, int any)
> hlist_for_each_entry(fid, h, dlist) {
> if (any || uid_eq(fid->uid, uid)) {
> ret = fid;
> -atomic_inc(&ret->count);
> +refcount_inc(&ret->count);
> break;
> }
> }
> @@ -201,7 +199,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> dentry *dentry,
> }
> /* If we are root ourself just return that */
> if (dentry->d_sb->s_root == dentry) {
> -atomic_inc(&fid->count);
> +refcount_inc(&fid->count);
> return fid;
> }
> /*
> @@ -250,7 +248,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct
> dentry *dentry,
> fid = ERR_PTR(-ENOENT);
> } else {
> __add_fid(dentry, fid);
> -atomic_inc(&fid->count);
> +refcount_inc(&fid->count);
> spin_unlock(&dentry->d_lock);
> }
> }
> diff --git a/fs/9p/fid.h b/fs/9p/fid.h
> index 1fed96546728..f7f33509e169 100644
> --- a/fs/9p/fid.h
> +++ b/fs/9p/fid.h
> @@ -28,7 +28,7 @@ static inline struct p9_fid *v9fs_fid_clone(struct dentry
> *dentry)
> if (!fid || IS_ERR(fid))
> return fid;
>
> -nfid = p9_client_walk(fid, 0, NULL, 1);
> +nfid = clone_fid(fid);
> p9_client_clunk(fid);
> return nfid;
> }
> diff --git a/include/net/9p/client.h b/include/net/9p/client.h index
> 58ed9bd306bd..e1c308d8d288 100644
> --- a/include/net/9p/client.h
> +++ b/include/net/9p/client.h
> @@ -149,7 +149,7 @@ enum fid_source {
> struct p9_fid {
> struct p9_client *clnt;
> u32 fid;
> -atomic_t count;
> +refcount_t count;
> int mode;
> struct p9_qid qid;
> u32 iounit;
> diff --git a/net/9p/client.c b/net/9p/client.c index
> a6c8a915e0d8..ba4910138c5b 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -901,7 +901,7 @@ static struct p9_fid *p9_fid_create(struct p9_client
> *clnt)
> fid->clnt = clnt;
> fid->rdir = NULL;
> fid->fid = 0;
> -atomic_set(&fid->count, 1);
> +refcount_set(&fid->count, 1);
>
> idr_preload(GFP_KERNEL);
> spin_lock_irq(&clnt->lock);
> @@ -1466,7 +1466,7 @@ int p9_client_clunk(struct p9_fid *fid)
> dump_stack();
> return 0;
> }
> -if (!atomic_dec_and_test(&fid->count))
> +if (!refcount_dec_and_test(&fid->count))
> return 0;
>
> again:
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.