[PATCH v4 11/11] selftests: ceph: wire up Ceph reset kselftests and documentation

From: Alex Markuze

Date: Thu May 07 2026 - 08:32:47 EST


Wire the CephFS reset test suite into the kselftest build:

- Add filesystems/ceph to the top-level selftests Makefile.
- Add the per-suite Makefile with run_validation.sh as TEST_PROGS.
- Add the settings file (kselftest timeout).
- Add the MAINTAINERS entry for the test directory.
- Add README with prerequisites, usage, and troubleshooting.

While here, hoist the loop counter declaration out of the for statement
in dump_cap_flushes() and add a blank line before struct
ceph_mds_cap_match in mds_client.h.

Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
---
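The README troubleshooting section names two failure cases: no reset interface found, and multiple Ceph clients found. A minimal shell sketch of that selection logic follows, assuming each client directory under /sys/kernel/debug/ceph/ exposes a reset/ entry, as the README's `ls` check suggests. The helper names list_reset_clients and pick_reset_client are hypothetical and are not taken from the scripts in this series. The sketch only checks that reset/ exists, not that it is writable.

```shell
#!/bin/sh
# Illustrative sketch only; not part of this series.

# Hypothetical helper: print the name of every Ceph debugfs client
# directory that contains a reset/ entry. The debugfs root is a
# parameter so the logic can be tried on a scratch tree.
list_reset_clients() {
	root="${1:-/sys/kernel/debug/ceph}"
	for dir in "$root"/*/; do
		# An unmatched glob stays literal; the -d test filters it out.
		[ -d "${dir}reset" ] || continue
		basename "$dir"
	done
}

# Hypothetical helper mirroring the two README troubleshooting cases:
# return 1 when no client is found, 2 when more than one is found,
# otherwise print the single client name.
pick_reset_client() {
	clients=$(list_reset_clients "$1")
	count=$(printf '%s\n' "$clients" | grep -c . || true)
	case "$count" in
	0) echo "no Ceph reset interface found" >&2; return 1 ;;
	1) printf '%s\n' "$clients" ;;
	*) echo "multiple Ceph clients found; pass --client-id" >&2; return 2 ;;
	esac
}
```

Called with no argument, pick_reset_client inspects the real debugfs tree, which needs root. Passing a scratch directory as the first argument lets the logic be tried without a Ceph mount.
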
MAINTAINERS | 1 +
fs/ceph/mds_client.c | 3 +-
fs/ceph/mds_client.h | 1 +
tools/testing/selftests/Makefile | 1 +
.../selftests/filesystems/ceph/Makefile | 7 ++
.../testing/selftests/filesystems/ceph/README | 84 +++++++++++++++++++
.../selftests/filesystems/ceph/settings | 1 +
7 files changed, 97 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/filesystems/ceph/Makefile
create mode 100644 tools/testing/selftests/filesystems/ceph/README
create mode 100644 tools/testing/selftests/filesystems/ceph/settings

diff --git a/MAINTAINERS b/MAINTAINERS
index 2fb1c75afd16..bf6d973ac3fb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5905,6 +5905,7 @@ B: https://tracker.ceph.com/
T: git https://github.com/ceph/ceph-client.git
F: Documentation/filesystems/ceph.rst
F: fs/ceph/
+F: tools/testing/selftests/filesystems/ceph/

CERTIFICATE HANDLING
M: David Howells <dhowells@xxxxxxxxxx>
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index b16638ebff7f..3b6560da8c4e 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2359,6 +2359,7 @@ struct flush_dump_entry {
 static void dump_cap_flushes(struct ceph_mds_client *mdsc, u64 want_tid)
 {
 	struct ceph_client *cl = mdsc->fsc->client;
+	int i;
 	struct flush_dump_entry entries[CEPH_CAP_FLUSH_MAX_DUMP_ENTRIES];
 	struct ceph_cap_flush *cf;
 	int n = 0, remaining = 0;
@@ -2388,7 +2389,7 @@ static void dump_cap_flushes(struct ceph_mds_client *mdsc, u64 want_tid)

 	pr_info_client(cl, "still waiting for cap flushes through %llu:\n",
 		       want_tid);
-	for (int i = 0; i < n; i++) {
+	for (i = 0; i < n; i++) {
 		struct flush_dump_entry *e = &entries[i];

 		if (e->ci_null)
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index b1a0621cd37e..731d6ad04956 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -121,6 +121,7 @@ static inline bool ceph_reset_is_idle(struct ceph_client_reset_state *st)
 {
 	return READ_ONCE(st->phase) == CEPH_CLIENT_RESET_IDLE;
 }
+
 struct ceph_mds_cap_match {
 	s64 uid; /* default to MDS_AUTH_UID_ANY */
 	u32 num_gids;
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 6e59b8f63e41..ab254ae793a9 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -32,6 +32,7 @@ TARGETS += exec
TARGETS += fchmodat2
TARGETS += filesystems
TARGETS += filesystems/binderfs
+TARGETS += filesystems/ceph
TARGETS += filesystems/epoll
TARGETS += filesystems/fat
TARGETS += filesystems/overlayfs
diff --git a/tools/testing/selftests/filesystems/ceph/Makefile b/tools/testing/selftests/filesystems/ceph/Makefile
new file mode 100644
index 000000000000..4ad3e8d40d90
--- /dev/null
+++ b/tools/testing/selftests/filesystems/ceph/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+
+TEST_PROGS := run_validation.sh
+TEST_FILES := reset_stress.sh reset_corner_cases.sh \
+ validate_consistency.py README settings
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/filesystems/ceph/README b/tools/testing/selftests/filesystems/ceph/README
new file mode 100644
index 000000000000..eb0092b38f80
--- /dev/null
+++ b/tools/testing/selftests/filesystems/ceph/README
@@ -0,0 +1,84 @@
+# CephFS Client Reset Test Suite
+
+Test suite for the CephFS kernel client manual session reset feature.
+This trimmed set contains the single-client stress test, the targeted
+corner-case test, and the one-shot validation harness used during
+feature bring-up.
+
+## Prerequisites
+
+- Linux kernel with the CephFS client reset feature (this branch)
+- A running Ceph cluster with at least one MDS
+- Root access (debugfs requires it)
+- Python 3 (for validators)
+- flock utility (for lock tests, usually in util-linux)
+
+## Test inventory
+
+| Test | Script(s) | What it covers |
+|------|-----------|----------------|
+| Single-client stress | `reset_stress.sh` | I/O + resets + data integrity on one mount |
+| Corner cases | `reset_corner_cases.sh` | EBUSY, dirty caps, flock reclaim, unmount-during-reset |
+| Validation harness | `run_validation.sh` | baseline + corner cases + moderate/aggressive stress + final status check |
+
+## Quick start
+
+Stress run:
+
+    sudo ./reset_stress.sh --mount-point /mnt/cephfs --profile moderate
+
+Corner cases:
+
+    sudo ./reset_corner_cases.sh --mount-point /mnt/cephfs
+
+End-to-end validation:
+
+    sudo ./run_validation.sh --mount-point /mnt/cephfs
+
+## Stress profiles
+
+    baseline    - no resets, 1 IO + 1 rename, 600s
+    moderate    - reset every 5-15s, 2 IO + 1 rename, 900s
+    aggressive  - reset every 1-5s, 4 IO + 2 rename, 900s
+    soak        - reset every 5-15s, 2 IO + 1 rename, 3600s
+
+## Key options (all scripts)
+
+    --mount-point PATH   CephFS mount point (required)
+    --client-id ID       Debugfs client id (auto-detected if only one exists)
+
+reset_stress.sh additionally accepts:
+
+    --profile NAME       baseline|moderate|aggressive|soak
+    --duration-sec N     Override profile runtime
+    --no-reset           Disable reset injection
+    --out-dir PATH       Artifact directory
+
+## Corner case tests
+
+    [1/4] ebusy_rejection       Second reset rejected while first in-flight
+    [2/4] dirty_caps_at_reset   Reset with unflushed dirty caps
+    [3/4] flock_after_reset     Stale lock EIO + fresh lock after holder exit
+    [4/4] unmount_during_reset  umount during active reset (destroy-path wakeup)
+
+Test 4 needs a second CephFS mount instance and reports SKIP if the
+host cannot create one. See the `--help` output for details.
+
+## Troubleshooting
+
+**No writable Ceph reset interface found:**
+Kernel lacks the reset feature, debugfs not mounted, or not root.
+Check: `ls /sys/kernel/debug/ceph/*/reset/`
+
+**Multiple Ceph clients found:**
+Use `--client-id` to select one.
+List: `ls /sys/kernel/debug/ceph/`
+
+## Files
+
+| File | Role |
+|------|------|
+| `reset_stress.sh` | Single-client stress test runner |
+| `validate_consistency.py` | Single-client post-run validator |
+| `reset_corner_cases.sh` | Corner case harness (4 sequential tests) |
+| `run_validation.sh` | One-shot validation harness |
diff --git a/tools/testing/selftests/filesystems/ceph/settings b/tools/testing/selftests/filesystems/ceph/settings
new file mode 100644
index 000000000000..79b65bdf05db
--- /dev/null
+++ b/tools/testing/selftests/filesystems/ceph/settings
@@ -0,0 +1 @@
+timeout=1200
--
2.34.1