[PATCH -next v3] md/raid5-cache: fix a deadlock in r5l_exit_log()

From: Yu Kuai
Date: Sat Jul 08 2023 - 05:19:10 EST


From: Yu Kuai <yukuai3@xxxxxxxxxx>

Commit b13015af94cf ("md/raid5-cache: Clear conf->log after finishing
work") introduce a new problem:

// caller hold reconfig_mutex
r5l_exit_log
flush_work(&log->disable_writeback_work)
r5c_disable_writeback_async
wait_event
/*
* conf->log is not NULL, and mddev_trylock()
* will fail, wait_event() can never pass.
*/
conf->log = NULL

Fix this problem by setting 'config->log' to NULL before wake_up() as it
used to be, so that wait_event() from r5c_disable_writeback_async() can
exist. In the meantime, move forward md_unregister_thread() so that
null-ptr-deref this commit fixed can still be fixed.

Fixes: b13015af94cf ("md/raid5-cache: Clear conf->log after finishing work")
Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---

Changes in v3:
- Use a different solution.

drivers/md/raid5-cache.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 47ba7d9e81e1..2eac4a50d99b 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -3168,12 +3168,15 @@ void r5l_exit_log(struct r5conf *conf)
{
struct r5l_log *log = conf->log;

- /* Ensure disable_writeback_work wakes up and exits */
- wake_up(&conf->mddev->sb_wait);
- flush_work(&log->disable_writeback_work);
md_unregister_thread(&log->reclaim_thread);

+ /*
+ * 'reconfig_mutex' is held by caller, set 'confg->log' to NULL to
+ * ensure disable_writeback_work wakes up and exits.
+ */
conf->log = NULL;
+ wake_up(&conf->mddev->sb_wait);
+ flush_work(&log->disable_writeback_work);

mempool_exit(&log->meta_pool);
bioset_exit(&log->bs);
--
2.39.2