Re: [PATCH v4 11/11] selftests: ceph: wire up Ceph reset kselftests and documentation
From: Viacheslav Dubeyko
Date: Thu May 07 2026 - 15:38:24 EST
On Thu, 2026-05-07 at 12:27 +0000, Alex Markuze wrote:
> Wire the CephFS reset test suite into the kselftest build:
>
> - Add filesystems/ceph to the top-level selftests Makefile.
> - Add the per-suite Makefile with run_validation.sh as TEST_PROGS.
> - Add the settings file (kselftest timeout).
> - Add the MAINTAINERS entry for the test directory.
> - Add README with prerequisites, usage, and troubleshooting.
>
> Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
> ---
> MAINTAINERS | 1 +
> fs/ceph/mds_client.c | 3 +-
> fs/ceph/mds_client.h | 1 +
> tools/testing/selftests/Makefile | 1 +
> .../selftests/filesystems/ceph/Makefile | 7 ++
> .../testing/selftests/filesystems/ceph/README | 84 +++++++++++++++++++
> .../selftests/filesystems/ceph/settings | 1 +
> 7 files changed, 97 insertions(+), 1 deletion(-)
> create mode 100644 tools/testing/selftests/filesystems/ceph/Makefile
> create mode 100644 tools/testing/selftests/filesystems/ceph/README
> create mode 100644 tools/testing/selftests/filesystems/ceph/settings
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2fb1c75afd16..bf6d973ac3fb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5905,6 +5905,7 @@ B: https://tracker.ceph.com/
> T: git https://github.com/ceph/ceph-client.git
> F: Documentation/filesystems/ceph.rst
> F: fs/ceph/
> +F: tools/testing/selftests/filesystems/ceph/
>
> CERTIFICATE HANDLING
> M: David Howells <dhowells@xxxxxxxxxx>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index b16638ebff7f..3b6560da8c4e 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -2359,6 +2359,7 @@ struct flush_dump_entry {
> static void dump_cap_flushes(struct ceph_mds_client *mdsc, u64 want_tid)
> {
> struct ceph_client *cl = mdsc->fsc->client;
> + int i;
> struct flush_dump_entry entries[CEPH_CAP_FLUSH_MAX_DUMP_ENTRIES];
> struct ceph_cap_flush *cf;
> int n = 0, remaining = 0;
> @@ -2388,7 +2389,7 @@ static void dump_cap_flushes(struct ceph_mds_client *mdsc, u64 want_tid)
>
> pr_info_client(cl, "still waiting for cap flushes through %llu:\n",
> want_tid);
> - for (int i = 0; i < n; i++) {
> + for (i = 0; i < n; i++) {
> struct flush_dump_entry *e = &entries[i];
>
> if (e->ci_null)
> diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
> index b1a0621cd37e..731d6ad04956 100644
> --- a/fs/ceph/mds_client.h
> +++ b/fs/ceph/mds_client.h
> @@ -121,6 +121,7 @@ static inline bool ceph_reset_is_idle(struct ceph_client_reset_state *st)
> {
> return READ_ONCE(st->phase) == CEPH_CLIENT_RESET_IDLE;
> }
> +
> struct ceph_mds_cap_match {
> s64 uid; /* default to MDS_AUTH_UID_ANY */
> u32 num_gids;
> diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
> index 6e59b8f63e41..ab254ae793a9 100644
> --- a/tools/testing/selftests/Makefile
> +++ b/tools/testing/selftests/Makefile
> @@ -32,6 +32,7 @@ TARGETS += exec
> TARGETS += fchmodat2
> TARGETS += filesystems
> TARGETS += filesystems/binderfs
> +TARGETS += filesystems/ceph
> TARGETS += filesystems/epoll
> TARGETS += filesystems/fat
> TARGETS += filesystems/overlayfs
> diff --git a/tools/testing/selftests/filesystems/ceph/Makefile b/tools/testing/selftests/filesystems/ceph/Makefile
> new file mode 100644
> index 000000000000..4ad3e8d40d90
> --- /dev/null
> +++ b/tools/testing/selftests/filesystems/ceph/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +TEST_PROGS := run_validation.sh
> +TEST_FILES := reset_stress.sh reset_corner_cases.sh \
> + validate_consistency.py README settings
> +
> +include ../../lib.mk
> diff --git a/tools/testing/selftests/filesystems/ceph/README b/tools/testing/selftests/filesystems/ceph/README
> new file mode 100644
> index 000000000000..eb0092b38f80
> --- /dev/null
> +++ b/tools/testing/selftests/filesystems/ceph/README
> @@ -0,0 +1,84 @@
> +# CephFS Client Reset Test Suite
> +
> +Test suite for the CephFS kernel client manual session reset feature.
> +This trimmed set contains the single-client stress test, the targeted
> +corner-case test, and the one-shot validation harness used during
> +feature bring-up.
> +
> +## Prerequisites
> +
> +- Linux kernel with the CephFS client reset feature (this branch)
> +- A running Ceph cluster with at least one MDS
> +- Root access (debugfs requires it)
> +- Python 3 (for validators)
> +- flock utility (for lock tests, usually in util-linux)
> +
> +## Test inventory
> +
> +| Test | Script(s) | What it covers |
> +|------|-----------|----------------|
> +| Single-client stress | `reset_stress.sh` | I/O + resets + data integrity on one mount |
> +| Corner cases | `reset_corner_cases.sh` | EBUSY, dirty caps, flock reclaim, unmount-during-reset |
> +| Validation harness | `run_validation.sh` | baseline + corner cases + moderate/aggressive stress + final status check |
> +
> +## Quick start
> +
> +Stress run:
> +
> + sudo ./reset_stress.sh --mount-point /mnt/cephfs --profile moderate
> +
> +Corner cases:
> +
> + sudo ./reset_corner_cases.sh --mount-point /mnt/cephfs
> +
> +End-to-end validation:
> +
> + sudo ./run_validation.sh --mount-point /mnt/cephfs
> +
> +## Stress profiles
> +
> + baseline - no resets, 1 IO + 1 rename, 600s
> + moderate - reset every 5-15s, 2 IO + 1 rename, 900s
> + aggressive - reset every 1-5s, 4 IO + 2 rename, 900s
> + soak - reset every 5-15s, 2 IO + 1 rename, 3600s
> +
> +## Key options (all scripts)
> +
> + --mount-point PATH CephFS mount point (required)
> + --client-id ID Debugfs client id (auto-detected if only one)
> +
> +reset_stress.sh additionally accepts:
> +
> + --profile NAME baseline|moderate|aggressive|soak
> + --duration-sec N Override profile runtime
> + --no-reset Disable reset injection
> + --out-dir PATH Artifact directory
> +
> +## Corner case tests
> +
> + [1/4] ebusy_rejection Second reset rejected while first in-flight
> + [2/4] dirty_caps_at_reset Reset with unflushed dirty caps
> + [3/4] flock_after_reset Stale lock EIO + fresh lock after holder exit
> + [4/4] unmount_during_reset umount during active reset (destroy-path wakeup)
> +
> +Test 4 requires creating a second CephFS mount instance and SKIPs if
> +the host cannot do so. See `--help` output for details.
> +
> +## Troubleshooting
> +
> +**No writable Ceph reset interface found:**
> +Kernel lacks the reset feature, debugfs not mounted, or not root.
> +Check: `ls /sys/kernel/debug/ceph/*/reset/`
> +
> +**Multiple Ceph clients found:**
> +Use `--client-id` to select one.
> +List: `ls /sys/kernel/debug/ceph/`
> +
> +## Files
> +
> +| File | Role |
> +|------|------|
> +| `reset_stress.sh` | Single-client stress test runner |
> +| `validate_consistency.py` | Single-client post-run validator |
> +| `reset_corner_cases.sh` | Corner case harness (4 sequential tests) |
> +| `run_validation.sh` | One-shot validation harness |
> diff --git a/tools/testing/selftests/filesystems/ceph/settings b/tools/testing/selftests/filesystems/ceph/settings
> new file mode 100644
> index 000000000000..79b65bdf05db
> --- /dev/null
> +++ b/tools/testing/selftests/filesystems/ceph/settings
> @@ -0,0 +1 @@
> +timeout=1200
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
Thanks,
Slava.