Re: [PATCH v2] md: split MD_RECOVERY_NEEDED out of mddev_resume

From: Song Liu
Date: Thu Dec 07 2023 - 13:24:56 EST


On Wed, Dec 6, 2023 at 6:08 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> New mddev_resume() calls were added to synchronize IO with array
> reconfiguration. However, adding one in md_start_sync() introduces a
> performance regression:
>
> 1) someone sets MD_RECOVERY_NEEDED first;
> 2) daemon thread grabs reconfig_mutex, then clears MD_RECOVERY_NEEDED and
> queues a new sync work;
> 3) daemon thread releases reconfig_mutex;
> 4) in md_start_sync():
>    a) check that there are spares that can be added/removed, then suspend
>       the array;
>    b) remove_and_add_spares() may not be called, or may be called without
>       actually adding or removing any spares;
>    c) resume the array, then set MD_RECOVERY_NEEDED again!
>
> Steps 2 - 4 then repeat, so mddev_suspend() is called very often and, as a
> consequence, normal IO becomes quite slow.
>
> Fix this problem by not setting MD_RECOVERY_NEEDED again in
> md_start_sync(), which breaks the loop.
>
> Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration")
> Suggested-by: Song Liu <song@xxxxxxxxxx>
> Reported-by: Janpieter Sollie <janpieter.sollie@xxxxxxxxx>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218200
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
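
For anyone following along, the loop in steps 2 - 4 can be illustrated with a
toy user-space model. This is only a sketch and not the kernel code: the
struct, the toy_* functions and the resume_sets_needed parameter are all
hypothetical names, and it assumes the worst case where spares always look
addable or removable but nothing is ever actually changed.

```c
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. All names here are hypothetical. */
struct toy_mddev {
    bool recovery_needed;  /* stands in for MD_RECOVERY_NEEDED */
    bool spares_pending;   /* step 4a: spares look addable/removable */
    int suspend_count;     /* how many times the array was suspended */
};

/* Step 4: rough model of md_start_sync(). */
static void toy_start_sync(struct toy_mddev *m, bool resume_sets_needed)
{
    if (!m->spares_pending)
        return;

    m->suspend_count++;    /* 4a: suspend the array */
    /* 4b: assume no spare is actually added or removed */
    if (resume_sets_needed)
        m->recovery_needed = true;  /* 4c: behaviour before the fix */
}

/*
 * Steps 2 - 3: rough model of the daemon thread. Runs at most max_iters
 * passes and returns how many times the array was suspended.
 */
static int toy_run_daemon(bool resume_sets_needed, int max_iters)
{
    struct toy_mddev m = {
        .recovery_needed = true,   /* step 1: someone set the flag */
        .spares_pending = true,
        .suspend_count = 0,
    };

    for (int i = 0; i < max_iters && m.recovery_needed; i++) {
        m.recovery_needed = false; /* step 2: clear flag, queue sync work */
        toy_start_sync(&m, resume_sets_needed);
    }
    return m.suspend_count;
}
```

In this model toy_run_daemon(true, 1000) suspends on every pass until the
iteration cap is hit, while toy_run_daemon(false, 1000) suspends once and
then stops because nothing sets the flag again.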

Thanks for the fix! I added a comment and applied it to md-fixes.

Song