On 20 Feb 19:55, Yu Kuai wrote:
I just take a quick look, the problem looks obviously to me, see how
md_seq_show() handle the iteration.
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 465ca2af1e6e..7c7a58f618c1 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9911,8 +9911,11 @@ static int md_notify_reboot(struct notifier_block
*this,
mddev_unlock(mddev);
}
need_delay = 1;
- mddev_put(mddev);
- spin_lock(&all_mddevs_lock);
+
+ spin_lock(&all_mddevs_lock)
+ if (atomic_dec_and_test(&mddev->active))
+ __mddev_put(mddev);
+
}
spin_unlock(&all_mddevs_lock);
While cooking the patch, this is not enough, list_for_each_entry_safe()
should be replaced with list_for_each_entry() as well.
Will send the patch soon, with:
Reported-by: Guillaume Morin <guillaume@xxxxxxxxxxx>
Thank you! I just saw the patch and we are going to test it and let you
know.
The issue with the next pointer seems to be fixed with your change.
Though I am still unclear how the 2nd potential issue I mentioned -
where the current item would be freed concurrently by mddev_free() - is
prevented. I am not finding anything in the code that seems to prevent a
concurrent call to mddev_free() for the current item in the
list_for_each_entry() loop (and therefore accessing mddev after the
kfree()).
I understand that we are getting a reference through the active atomic
in mddev_get() under the lock in md_notify_reboot() but how is that
preventing mddev_free() from freeing the mddev as soon as we release the
all_mddevs_lock in the loop?
I am not not familiar with this code so I am most likely missing
osmething but if you had the time to explain, that would be very
helpful.
TIA
Guillaume.