[PATCH 3.16.y-ckt 049/104] ocfs2/dlm: fix deadlock when dispatch assert master

From: Luis Henriques
Date: Mon Oct 26 2015 - 09:52:48 EST


3.16.7-ckt19 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Joseph Qi <joseph.qi@xxxxxxxxxx>

commit 012572d4fc2e4ddd5c8ec8614d51414ec6cae02a upstream.

The order of the following three spinlocks should be:
dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock

But dlm_dispatch_assert_master() is called while holding
dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls
dlm_grab() which will take dlm_domain_lock.

Once another thread (for example, dlm_query_join_handler) has already
taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock
happens.

Signed-off-by: Joseph Qi <joseph.qi@xxxxxxxxxx>
Cc: Joel Becker <jlbec@xxxxxxxxxxxx>
Cc: Mark Fasheh <mfasheh@xxxxxxxx>
Cc: "Junxiao Bi" <junxiao.bi@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
---
fs/ocfs2/dlm/dlmmaster.c | 9 ++++++---
fs/ocfs2/dlm/dlmrecovery.c | 8 ++++++--
2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 7b9f96899812..189e7e1b7144 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -1450,6 +1450,7 @@ int dlm_master_request_handler(struct o2net_msg *msg, u32 len, void *data,
int found, ret;
int set_maybe;
int dispatch_assert = 0;
+ int dispatched = 0;

if (!dlm_grab(dlm))
return DLM_MASTER_RESP_NO;
@@ -1656,14 +1657,17 @@ send_response:
mlog(ML_ERROR, "failed to dispatch assert master work\n");
response = DLM_MASTER_RESP_ERROR;
dlm_lockres_put(res);
- } else
+ } else {
+ dispatched = 1;
dlm_lockres_grab_inflight_worker(dlm, res);
+ }
} else {
if (res)
dlm_lockres_put(res);
}

- dlm_put(dlm);
+ if (!dispatched)
+ dlm_put(dlm);
return response;
}

@@ -2083,7 +2087,6 @@ int dlm_dispatch_assert_master(struct dlm_ctxt *dlm,


/* queue up work for dlm_assert_master_worker */
- dlm_grab(dlm); /* get an extra ref for the work item */
dlm_init_work_item(dlm, item, dlm_assert_master_worker, NULL);
item->u.am.lockres = res; /* already have a ref */
/* can optionally ignore node numbers higher than this node */
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 45067faf5695..5084ce856879 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1687,6 +1687,7 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
unsigned int hash;
int master = DLM_LOCK_RES_OWNER_UNKNOWN;
u32 flags = DLM_ASSERT_MASTER_REQUERY;
+ int dispatched = 0;

if (!dlm_grab(dlm)) {
/* since the domain has gone away on this
@@ -1708,15 +1709,18 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
mlog_errno(-ENOMEM);
/* retry!? */
BUG();
- } else
+ } else {
+ dispatched = 1;
__dlm_lockres_grab_inflight_worker(dlm, res);
+ }
} else /* put.. incase we are not the master */
dlm_lockres_put(res);
spin_unlock(&res->spinlock);
}
spin_unlock(&dlm->spinlock);

- dlm_put(dlm);
+ if (!dispatched)
+ dlm_put(dlm);
return master;
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/