[RFC PATCH] ceph: reduce contention in ceph_check_delayed_caps()

From: Luis Henriques
Date: Fri Jun 25 2021 - 11:46:06 EST


Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
workqueue and it can be kept looping for quite some time if caps keep being
added back to the mdsc->cap_delay_list. This may result in the watchdog
tainting the kernel with the softlockup flag.

This patch re-arranges the loop through the caps list so that it initially
removes all the caps from list, adding them to a temporary list. And then, with
less locking contention, it will eventually call the ceph_check_caps() for each
inode. Any caps added to the list in the meantime will be handled in the next
run.

Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Luis Henriques <lhenriques@xxxxxxx>
---
Hi Jeff!

So, I've not based this patch on top of your patchset that gets rid of
ceph_async_iput() so that it will make it easier to backport it for stable
kernels. Of course I'm not 100% this classifies as stable material.

Other than that, I've been testing this patch and I couldn't see anything
breaking. Let me know what you think.

(I *think* I've seen a tracker bug for this in the past but I couldn't
find it. I guess it could be added as a 'Link:' tag.)

Cheers,
--
Luis

fs/ceph/caps.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index a5e93b185515..727e41e3b939 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -4229,6 +4229,7 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
{
struct inode *inode;
struct ceph_inode_info *ci;
+ LIST_HEAD(caps_list);

dout("check_delayed_caps\n");
spin_lock(&mdsc->cap_delay_lock);
@@ -4239,19 +4240,23 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
if ((ci->i_ceph_flags & CEPH_I_FLUSH) == 0 &&
time_before(jiffies, ci->i_hold_caps_max))
break;
- list_del_init(&ci->i_cap_delay_list);
+ list_move_tail(&ci->i_cap_delay_list, &caps_list);
+ }
+ spin_unlock(&mdsc->cap_delay_lock);

+ while (!list_empty(&caps_list)) {
+ ci = list_first_entry(&caps_list,
+ struct ceph_inode_info,
+ i_cap_delay_list);
+ list_del_init(&ci->i_cap_delay_list);
inode = igrab(&ci->vfs_inode);
if (inode) {
- spin_unlock(&mdsc->cap_delay_lock);
dout("check_delayed_caps on %p\n", inode);
ceph_check_caps(ci, 0, NULL);
/* avoid calling iput_final() in tick thread */
ceph_async_iput(inode);
- spin_lock(&mdsc->cap_delay_lock);
}
}
- spin_unlock(&mdsc->cap_delay_lock);
}

/*