Re: [PATCH] md: ensure consistent action state in md_do_sync

From: Li Nan
Date: Mon Sep 01 2025 - 03:19:00 EST




On 2025/9/1 10:16, Li Nan wrote:


On 2025/8/30 17:51, Paul Menzel wrote:
Dear Nan,


Thank you for your patch.

On 30.08.25 at 11:05, linan666@xxxxxxxxxxxxxxx wrote:
From: Li Nan <linan122@xxxxxxxxxx>

The 'mddev->recovery' flags can change during md_do_sync(), leading to
inconsistencies. For example, starting with MD_RECOVERY_RECOVER and
ending with MD_RECOVERY_SYNC can cause incorrect offset updates.

Can you give a concrete example?


T1                    T2
md_do_sync
 action = ACTION_RECOVER
                    (write sysfs)
                    action_store
                     set MD_RECOVERY_SYNC
 [ do recovery ]
 update resync_offset

The corresponding code is:
```
        if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
            mddev->curr_resync > MD_RESYNC_ACTIVE) {
                if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) { /* SYNC is set, but what we actually did is recovery */
                        if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
                                if (mddev->curr_resync >= mddev->resync_offset) {
                                        pr_debug("md: checkpointing %s of %s.\n",
                                                 desc, mdname(mddev));
                                        if (test_bit(MD_RECOVERY_ERROR,
                                                &mddev->recovery))
                                                mddev->resync_offset =
                                                        mddev->curr_resync_completed;
                                        else
                                                mddev->resync_offset =
                                                        mddev->curr_resync;
                                }
```

To avoid this, use the 'action' determined at the beginning of the
function instead of repeatedly checking 'mddev->recovery'.
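
For illustration only, here is a minimal userspace sketch of that idea. It is
not the kernel code or the actual patch: 'struct dev', the flag values and the
three helper functions are hypothetical stand-ins for 'mddev', the
MD_RECOVERY_* bits and the finish path of md_do_sync(). It shows that a
decision taken from a snapshot is unaffected by a later flag change, while a
second read of the flags is.

```c
/* Hypothetical flag bits; the values are illustrative, not the kernel's. */
enum { RECOVERY_SYNC = 1u << 0, RECOVERY_RECOVER = 1u << 1 };

enum action { ACTION_RESYNC, ACTION_RECOVER };

/* Simplified stand-in for the fields of mddev used in this thread. */
struct dev {
	unsigned long recovery;      /* flags another context may modify */
	unsigned long resync_offset; /* checkpoint updated when a resync ends */
	unsigned long curr_resync;   /* progress of the running operation */
};

/* Decide once, at the start of the operation, what is being done. */
static enum action snapshot_action(const struct dev *d)
{
	return (d->recovery & RECOVERY_RECOVER) ? ACTION_RECOVER : ACTION_RESYNC;
}

/* Finish path that reads the flags again: sees whatever they are now. */
static void finish_rereading_flags(struct dev *d)
{
	if (d->recovery & RECOVERY_SYNC)
		d->resync_offset = d->curr_resync;
}

/* Finish path that trusts the snapshot taken at the start. */
static void finish_using_snapshot(struct dev *d, enum action action)
{
	if (action == ACTION_RESYNC)
		d->resync_offset = d->curr_resync;
}
```

If the flags start as RECOVERY_RECOVER and RECOVERY_SYNC is set afterwards,
finish_using_snapshot() leaves resync_offset alone, whereas
finish_rereading_flags() overwrites it with curr_resync.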

Do you have a reproducer?


I don't have a reproducer because reproducing it requires modifying the
kernel. The approximate steps are:

- Modify the kernel to add a delay before the above check.
- Trigger recovery by removing and adding disks.
- After recovery completes, write to the sysfs interface at the delay point
to set the sync flag.


Please ignore my previous reply; it was wrong. When MD_RECOVERY_RUNNING
is set, the recovery state should not be changed, so this patch is just a
cleanup. I will further improve the code that handles sync completion in v2.

--
Thanks,
Nan