[RFC] md/raid1: possible deadlock introduced in raid1_read_request()

From: Abd-Alrhman Masalkhi

Date: Sat Apr 25 2026 - 10:29:57 EST


Hi,

The raid1_read_request() function splits a bio regardless of whether it
is an original bio or an md-cloned bio (regardless of the r1bio_existed
value).

When an md_cloned_bio is resubmitted, raid1_read_request() treats it as a
new original bio instead of recognizing it as an md_cloned_bio.

If I understand this correctly, this results in allocating a new r1bio,
etc. More importantly, this may lead to a deadlock if we try to suspend
the array before the md driver calls
percpu_ref_tryget_live(&mddev->active_io) on the path down to
pers->make_request().
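
To make the scenario concrete, here is a toy user-space model of how I
read it. It is not kernel code: struct toy_ref and the toy_* helpers are
made-up stand-ins for the percpu_ref behind mddev->active_io, and the
model assumes that the original read keeps its reference until the
resubmitted part has completed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the percpu_ref behind mddev->active_io. */
struct toy_ref {
	int count;   /* references held by in-flight I/O */
	bool dying;  /* set once a suspend has killed the ref */
};

/* Models percpu_ref_tryget_live(): fails once the ref has been killed. */
static bool toy_tryget_live(struct toy_ref *r)
{
	if (r->dying)
		return false;
	r->count++;
	return true;
}

/* Models the start of an array suspend: no new live references. */
static void toy_kill(struct toy_ref *r)
{
	r->dying = true;
}

/* The suspend can only finish once every reference has been dropped. */
static bool toy_suspend_done(const struct toy_ref *r)
{
	return r->dying && r->count == 0;
}

/* Returns true if the modelled sequence leaves both sides waiting. */
static bool toy_resubmit_deadlocks(void)
{
	struct toy_ref active_io = { 0, false };
	bool child_entered;

	toy_tryget_live(&active_io);  /* original read enters md */
	toy_kill(&active_io);         /* suspend starts, waits for count == 0 */

	/* The resubmitted bio is treated as new and needs its own reference. */
	child_entered = toy_tryget_live(&active_io);

	/*
	 * Assumption: the original reference is dropped only after the
	 * resubmitted part completes, so neither side can make progress.
	 */
	return !child_entered && !toy_suspend_done(&active_io);
}
```

In this model toy_resubmit_deadlocks() returns true: the resubmitted bio
cannot get a live reference, and the suspend never sees the count reach
zero.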

I am considering two possible approaches: ending the bio if max_sectors
is smaller than bio_sectors(bio), or modifying read_balance() to select
a disk that can handle the full bio if r1bio_existed is set.
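
A rough user-space sketch of the two options, only to illustrate the
intended decision logic. struct toy_mirror, toy_decide() and
toy_read_balance() are hypothetical names, not existing raid1 symbols,
and the real read_balance() heuristics are reduced here to "first
readable disk".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one mirror leg. */
struct toy_mirror {
	bool readable;             /* leg can serve reads at all */
	unsigned int max_sectors;  /* longest read it can serve from the start sector */
};

enum toy_action { TOY_SUBMIT, TOY_SPLIT, TOY_FAIL };

/*
 * Option 1: never split a bio that already has an r1bio. If the chosen
 * disk cannot cover it, end the bio instead of splitting it again.
 */
static enum toy_action toy_decide(bool r1bio_existed,
				  unsigned int bio_sectors,
				  unsigned int max_sectors)
{
	if (max_sectors >= bio_sectors)
		return TOY_SUBMIT;
	return r1bio_existed ? TOY_FAIL : TOY_SPLIT;
}

/*
 * Option 2: when r1bio_existed is set, only accept a disk that can
 * serve the full bio. Returns the disk index, or -1 if none qualifies.
 */
static int toy_read_balance(const struct toy_mirror *m, size_t n,
			    unsigned int bio_sectors, bool r1bio_existed)
{
	for (size_t i = 0; i < n; i++) {
		if (!m[i].readable)
			continue;
		if (r1bio_existed && m[i].max_sectors < bio_sectors)
			continue;  /* would force a second split */
		return (int)i;
	}
	return -1;
}
```

With the second option a caller would still need to handle the -1 case,
which brings it back to ending the bio when no disk can serve it whole.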

Is this understanding correct, and what might be the preferred approach
in this case?

Thanks,
Abd-Alrhman