[PATCH v3 00/11] Bug fixes for mdadm tests

From: Logan Gunthorpe
Date: Thu Jun 02 2022 - 14:18:30 EST


Hi,

This is the updated series, incorporating the feedback received on v2 [1].

This series fixes all the kernel panics seen in the mdadm tests,
along with some related sparse issues, plus some cleanup.

The first two patches are cleanups from the original series that are
no longer directly related, but I thought they were worth including.

Patches 3 through 6 fix bugs with conf->log and remove the single,
unnecessary RCU access. This cleans up some sparse errors.

Patch 7 cleans up some sparse warnings with pslot usage.

Patch 8 is a cleanup which adds an enum so that patch 9 can
fix an mdadm hang. Patch 10 also fixes an mdadm hang.

I've also included patch 11 in this series, which fixes a recent
mistake in raid5-ppl that was reported by 0day and that I don't think
has been fixed yet.

This series will be followed by another series for mdadm which fixes
the segfaults and annotates some failing tests to make mdadm tests
runnable fairly reliably, but I'll wait for a stable hash for this
series to note the kernel version tested against. Following that,
v3 of my lock contention series will be sent with more confidence
of its correctness.

This series is based on the current md/md-next branch as of yesterday
(42b805af10). A git branch is available here:

https://github.com/sbates130272/linux-p2pmem md-bug_v3

Thanks,

Logan

--

Changes since v2:
* Rework the RCU changes to remove the RCU usage instead of using
it everywhere. This makes more sense since most accesses do not need
RCU because they are on the IO path or made with mddev_lock() held,
and the remaining ones are on the slow path and may take
mddev_lock(). (Per Christoph)
* Collect a couple more Reviewed-by tags from Christoph

Changes since v1:
* Add a patch to move the struct r5l_log to raid5-log.h in order
to fix a compiler error with rcu_access_pointer() in versions
prior to gcc-10
* Rework the r5c_is_writeback() changes to reduce churn (per Christoph)
* Change some 1s to trues in rcu_dereference_protected() calls (per
Christoph)
* Fix an odd hunk mistake in the RCU protection patch (per Christoph)
* Fix an inverted conditional (noticed by Donald)
* Add a patch to add an enum for the overloaded values used by
mddev->curr_resync to make the status_resync() fixes clearer
(per Christoph)

--

[1] https://lore.kernel.org/all/20220526163604.32736-1-logang@xxxxxxxxxxxx

Logan Gunthorpe (11):
md/raid5-log: Drop extern decorators for function prototypes
md/raid5-ppl: Drop unused argument from ppl_handle_flush_request()
md/raid5: Ensure array is suspended for calls to log_exit()
md/raid5-cache: Take mddev_lock in r5c_journal_mode_show()
md/raid5-cache: Drop RCU usage of conf->log
md/raid5-cache: Clear conf->log after finishing work
md/raid5-cache: Annotate pslot with __rcu notation
md: Use enum for overloaded magic numbers used by mddev->curr_resync
md: Ensure resync is reported after it starts
md: Notify sysfs sync_completed in md_reap_sync_thread()
md/raid5-ppl: Fix argument order in bio_alloc_bioset()

drivers/md/md.c | 55 +++++++++++++++-------------
drivers/md/md.h | 15 ++++++++
drivers/md/raid5-cache.c | 34 +++++++++---------
drivers/md/raid5-log.h | 77 +++++++++++++++++++---------------------
drivers/md/raid5-ppl.c | 6 ++--
drivers/md/raid5.c | 18 ++++------
6 files changed, 109 insertions(+), 96 deletions(-)


base-commit: 42b805af102471f53e3c7867b8c2b502ea4eef7e
--
2.30.2