Re: [PATCH -next 3/3] md: use interruptible apis in idle/frozen_sync_thread()

From: Yu Kuai
Date: Tue Jan 30 2024 - 02:04:26 EST


Hi,

在 2024/01/30 14:37, Song Liu 写道:
Hi,

Sorry for the late reply.

The first two patches of the set look good, so I applied them to
md-tmp-6.9 branch. However, this one needs a respin.

We are fixing dm-raid regressions, so I'll not send a new version until
that work is done. :)

On Thu, Dec 28, 2023 at 4:58 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

From: Yu Kuai <yukuai3@xxxxxxxxxx>

Before refactoring idle and frozen from action_store, interruptible apis
is used so that hungtask warning won't be triggered if it takes too long
to finish idle/frozen sync_thread. So change to use interruptible apis.

This paragraph is confusing. Please rephrase it.


In order not to make stop_sync_thread() more complicated, factor out a
helper prepare_to_stop_sync_thread() to replace stop_sync_thread().

Also return error to user if idle/frozen_sync_thread() failed, otherwise
user will be misleaded.

s/misleaded/misled/


Fixes: 130443d60b1b ("md: refactor idle/frozen_sync_thread() to fix deadlock")
Fixes: 8e8e2518fcec ("md: Close race when setting 'action' to 'idle'.")

Please add more information about what is being fixed here, so that
we can make a clear decision on whether the fix needs to be back
ported to stable kernels.

8e8e2518fcec added the interruptible apis first, however, it doesn't
return error to user if it's interrupted.

130443d60b1b deleted the interruptible apis.

Perhaps I should split this patch into two minor patches, one for each
fixed tag.


Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---
drivers/md/md.c | 105 ++++++++++++++++++++++++++++++------------------
1 file changed, 67 insertions(+), 38 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 60f99768a1a9..9ea05de79fe4 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4846,26 +4846,34 @@ action_show(struct mddev *mddev, char *page)
return sprintf(page, "%s\n", type);
}

+static bool sync_thread_stopped(struct mddev *mddev, int *sync_seq)

I think we need a comment for this.

+{
+ if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
+ return true;
+
+ if (sync_seq && *sync_seq != atomic_read(&mddev->sync_seq))
+ return true;
+
+ return false;
+}
+
/**
- * stop_sync_thread() - wait for sync_thread to stop if it's running.
+ * prepare_to_stop_sync_thread() - prepare to stop sync_thread if it's running.
* @mddev: the array.
- * @locked: if set, reconfig_mutex will still be held after this function
- * return; if not set, reconfig_mutex will be released after this
- * function return.
- * @check_seq: if set, only wait for curent running sync_thread to stop, noted
- * that new sync_thread can still start.
+ * @unlock: whether or not caller want to release reconfig_mutex if
+ * sync_thread is not running.
+ *
+ * Return true if sync_thread is running, release reconfig_mutex and do
+ * preparatory work to stop sync_thread, caller should wait for
+ * sync_thread_stopped() to return true. Return false if sync_thread is not
+ * running, reconfig_mutex will be released if @unlock is set.
*/

I found prepare_to_stop_sync_thread very hard to reason. Please try to
rephrase the comment or refactor the code. Maybe it makes sense to put
the following logic and its variations to a separate function:

if (prepare_to_stop_sync_thread(mddev, false)) {
wait_event(resync_wait, sync_thread_stopped(mddev, NULL));
mddev_lock_nointr(mddev);
}

I can do this, but there are 5 callers and only two of them can use the
separate caller. Pehaps something like this?

void stop_sync_thread(struct mddev *mddev, bool wait_sb)
{
if (prepare_to_stop_sync_thread(mddev, wait_sb)) {
wait_event(resync_wait, ...);
if (!wait_sb) {
mddev_lock_nointr(mddev);
return;
}
}

if (wait_sb) {
wait_event(sb_wait, ...);
mddev_lock_nointr(mddev);
}
}

int stop_sync_thread_interruptible(struct mddev *mddev, bool check_sync_seq)
{
..
}

Thanks,
Kuai


Thanks,
Song

-static void stop_sync_thread(struct mddev *mddev, bool locked, bool check_seq)
+static bool prepare_to_stop_sync_thread(struct mddev *mddev, bool unlock)
{
- int sync_seq;

[...]
.