Re: [PATCH 07/13] libmultipath: Add delayed removal support

From: Nilay Shroff

Date: Fri Apr 10 2026 - 03:07:11 EST

On 4/9/26 6:30 PM, John Garry wrote:

On 09/04/2026 07:37, Nilay Shroff wrote:

You mean a common blktests testcase, right?

For NVMe, that test would:
a. try to remove NVMe ko when we have the delayed removal active
b. ensure that we can queue for no path

I suppose that a common testcase could be possible (with dm mpath), but doesn't dm have its own testsuite?

Yes, I'd add a blktest for the 'queue_if_no_path' feature. But since dm has
its own separate test suite under blktests, I'd first target an nvme testcase
and then later add another testcase for dm-multipath.

Testing a. effectively is a challenge, as we would typically not be able to remove the nvme modules anyway due to many other references.

For b, how about something like the following:

set_conditions() {
    _set_nvme_trtype "$@"
}

_delayed_nvme_reconnect_ctrl() {
    sleep 2
    _nvme_connect_subsys
}

test() {
    echo "Running ${TEST_NAME}"

    _setup_nvmet

    local nvmedev
    local ns
    local bytes_written

    _nvmet_target_setup
    _nvme_connect_subsys

    # Part a: Ensure writes fail when no path returns
    nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
    ns=$(_find_nvme_ns "${def_subsys_uuid}")
    echo 10 > "/sys/block/${ns}/delayed_removal_secs"
    bytes_written=$(run_xfs_io_pwritev2 /dev/"$ns" 4096)
    if [ "$bytes_written" != 4096 ]; then
        echo "could not write successfully initially"
    fi
    sleep 1
    _nvme_disconnect_ctrl "${nvmedev}"
    sleep 1
    ns=$(_find_nvme_ns "${def_subsys_uuid}")
    if [[ "${ns}" = "" ]]; then
        echo "could not find ns after disconnect"
    fi
    bytes_written=$(run_xfs_io_pwritev2 /dev/"$ns" 4096)
    if [ "$bytes_written" == 4096 ]; then
        echo "wrote successfully after disconnect"
    fi
    sleep 10
    ns=$(_find_nvme_ns "${def_subsys_uuid}")
    if [[ -n "${ns}" ]]; then
        echo "found ns after delayed removal"
    fi

    #echo "now part 2"
    # Part b: Ensure writes work for intermittent disconnect
    _nvme_connect_subsys

    nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
    ns=$(_find_nvme_ns "${def_subsys_uuid}")
    echo 10 > "/sys/block/${ns}/delayed_removal_secs"
    bytes_written=$(run_xfs_io_pwritev2 /dev/"$ns" 4096)
    if [ "$bytes_written" != 4096 ]; then
        echo "could not write successfully initially"
    fi
    sleep 1
    _nvme_disconnect_ctrl "${nvmedev}"
    sleep 1
    ns=$(_find_nvme_ns "${def_subsys_uuid}")
    if [[ "${ns}" = "" ]]; then
        echo "could not find ns after disconnect"
    fi
    _delayed_nvme_reconnect_ctrl &
    sleep 1
    bytes_written=$(run_xfs_io_pwritev2 /dev/"$ns" 4096)
    if [ "$bytes_written" != 4096 ]; then
        echo "could not write successfully with reconnect"
    fi

It seems there may be a race here if we attempt to write to $ns before
the reconnect has completed in _delayed_nvme_reconnect_ctrl.

If the intention is simply to verify that the controller reconnect occurs
within the delayed removal window and test pwrite, then it may be sufficient
to:
- perform the reconnect, and
- then validate the write (pwrite) afterwards.

In that case, we could either:
- run _delayed_nvme_reconnect_ctrl in the foreground, or
- open-code the reconnect directly in the script before issuing the write.
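
For illustration, the foreground option could look roughly like the sketch
below. To keep it runnable outside blktests, _nvme_connect_subsys and
run_xfs_io_pwritev2 are replaced by hypothetical stubs, and a temporary state
file stands in for the controller connection. It only shows the ordering
(reconnect completes, then the write is validated) and does not model real
path-loss or queueing behaviour.

```shell
#!/bin/bash
# Sketch only: the two helpers below are hypothetical stand-ins for the
# blktests helpers of the same name, so this runs without NVMe hardware.
state_file=$(mktemp)
echo disconnected > "$state_file"

# Stub: mark the (simulated) controller as connected.
_nvme_connect_subsys() {
    echo connected > "$state_file"
}

# Stub: report the requested byte count only while "connected", else 0.
run_xfs_io_pwritev2() {
    local size="$2"
    if [[ "$(cat "$state_file")" == connected ]]; then
        echo "$size"
    else
        echo 0
    fi
}

_delayed_nvme_reconnect_ctrl() {
    sleep 2
    _nvme_connect_subsys
}

# Foreground call (no trailing '&'): this returns only after the reconnect
# has completed, so the write below cannot race with it.
_delayed_nvme_reconnect_ctrl

bytes_written=$(run_xfs_io_pwritev2 /dev/null 4096)
if [ "$bytes_written" != 4096 ]; then
    echo "could not write successfully with reconnect"
fi

rm -f "$state_file"
```

With the '&' dropped, the separate "sleep 1" before the write is no longer
needed for ordering, since the function call itself is the synchronization
point.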

    sleep 10
    ns=$(_find_nvme_ns "${def_subsys_uuid}")
    if [[ "${ns}" = "" ]]; then
        echo "could not find ns after delayed reconnect"
    fi

    # Final tidy-up
    echo 0 > /sys/block/"$ns"/delayed_removal_secs
    nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
    _nvme_disconnect_ctrl "${nvmedev}"
    _nvmet_target_cleanup

    echo "Test complete"
}

Otherwise overall this looks good to me.

Thanks,
--Nilay