Re: [PATCH -next] md/raid1: fix data corruption for degraded array with slow disk

From: Song Liu
Date: Thu Aug 15 2024 - 16:50:06 EST


On Sat, Aug 3, 2024 at 2:15 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> read_balance() avoids reading from slow disks as much as possible.
> However, if valid data resides only on slow disks while a new normal
> disk is still in recovery, unrecovered data can be read:
>
> raid1_read_request
>  read_balance
>   raid1_should_read_first
>   -> return false
>   choose_best_rdev
>   -> normal disk is not recovered, return -1
>   choose_bb_rdev
>   -> missing the checking of recovery, return the normal disk
>  -> read unrecovered data
>
> The root cause is that the recovery check is missing in
> choose_bb_rdev(). Add the check to fix the problem.
>
> Also fix the same problem in choose_slow_rdev().
>
> Fixes: 9f3ced792203 ("md/raid1: factor out choose_bb_rdev() from read_balance()")
> Fixes: dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()")
> Reported-and-tested-by: Mateusz Jończyk <mat.jonczyk@xxxxx>
> Closes: https://lore.kernel.org/all/9952f532-2554-44bf-b906-4880b2e88e3a@xxxxx/
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
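
For readers following the thread, the missing check described above can be
sketched in a small user-space model. This is not the kernel code and not
the patch itself: struct model_rdev, model_rdev_in_recovery() and
model_choose_fallback() are hypothetical names invented for illustration,
and the fields are simplified assumptions about what a member disk tracks.
The point is only that a fallback path must skip any disk whose recovered
region does not yet cover the whole read range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a RAID1 member disk. */
struct model_rdev {
	bool in_sync;             /* true once the disk holds a complete copy */
	uint64_t recovery_offset; /* sectors below this offset are recovered */
	bool slow;                /* disk is marked as slow */
};

/* True if any part of [sector, sector + sectors) is not yet recovered. */
static bool model_rdev_in_recovery(const struct model_rdev *rdev,
				   uint64_t sector, uint64_t sectors)
{
	return !rdev->in_sync && rdev->recovery_offset < sector + sectors;
}

/*
 * Fallback selection (assumed policy for this sketch): skip every disk
 * that is still recovering the requested range, prefer the first
 * non-slow disk, otherwise fall back to the first usable slow disk.
 * Returns the chosen index, or -1 if no disk can serve the read.
 */
static int model_choose_fallback(const struct model_rdev *disks, size_t n,
				 uint64_t sector, uint64_t sectors)
{
	int slow_candidate = -1;

	assert(disks != NULL);

	for (size_t i = 0; i < n; i++) {
		/* The check whose absence lets unrecovered data be read. */
		if (model_rdev_in_recovery(&disks[i], sector, sectors))
			continue;
		if (!disks[i].slow)
			return (int)i;
		if (slow_candidate < 0)
			slow_candidate = (int)i;
	}
	return slow_candidate;
}
```

In this model, a two-disk array with a recovering normal disk (recovered up
to sector 100) and an in-sync slow disk serves a read of sectors 96..103
from the slow disk, while a read of sectors 0..7 can still go to the normal
disk because that range is already recovered.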

Applied to md-6.11. Thanks for the fix!

Song
