[PATCH 0/2] ksmbd: fix stop_sessions() iteration and centralize ksmbd_conn release

From: DaeMyung Kang

Date: Sat Apr 25 2026 - 05:38:52 EST


Two independent fixes around ksmbd connection teardown.

Patch 1 ("ksmbd: rewrite stop_sessions() with restartable iteration")
fixes a stale-iterator bug in stop_sessions(). The current loop drops
conn_list_lock around transport->shutdown() and re-acquires it to
continue the same hash_for_each() walk; a concurrent ksmbd_conn_free()
in the unlocked window can make the iterator revisit or skip
connections. The rewrite pins one connection at a time, marks it via
a new conn->stop_called flag, runs shutdown outside the lock, and
restarts from the top. The "is the hash drained?" check is moved
into the locked walk so it no longer races hash_del() from a handler
thread. Patch 1 carries Cc: stable@xxxxxxxxxxxxxxx.

Teardown wall time, ksmbd.control --shutdown + rmmod ksmbd with
N concurrent nosharesock cifs connections (3-run mean):

N before after
10 4.93s 5.34s
30 7.34s 7.03s
50 7.25s 7.04s
100 6.98s 6.78s
200 6.77s 6.89s

Performance is unchanged; teardown is dominated by the msleep(100)
outer retry, not by the iteration. The number of ->shutdown()
calls equals the number of live connections on both paths when the
race is not artificially widened.

Patch 2 ("ksmbd: centralize ksmbd_conn final release to plug transport
leak") routes the three bare-kfree final-put sites
(ksmbd_conn_r_count_dec, __free_opinfo, session_fd_check) through a
new ksmbd_conn_put() that defers the once-per-struct cleanup
(ida_destroy, free_transport, kfree) onto a dedicated workqueue.
Those sites previously skipped transport cleanup whenever they
happened to be the last putter, leaking struct tcp_transport and the
iov kvec (TCP) and the embedded async_ida. The workqueue bounce is
needed because the centralized release reaches lock_sock_nested()
through tcp_close(), which sleeps; __free_opinfo() can be a final
putter from an RCU softirq callback (free_opinfo_rcu).

A/B leak validation, QEMU/virtme guest with ksmbd server and CIFS
client in the same guest, debug kernel (CONFIG_DEBUG_KMEMLEAK +
CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP +
CONFIG_DEBUG_OBJECTS + CONFIG_FAILSLAB):

Reproducer per iteration: hold 8 fds open via sleep processes,
force-close TCP with `ss -K sport = :445`, kill holders,
lazy-umount; 10 iterations, then ksmbd shutdown + kmemleak scan.
Pre-patch is HEAD with only patch 1 of this series applied:

state conn_alloc conn_free tcp_free opi_rcu kmemleak
---------- ---------- --------- -------- ------- --------
pre-patch 20 20 10 160 7
with patch 20 20 20 160 0

Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the
bare-kfree paths skipping transport cleanup; kmemleak reports 7
unreferenced struct tcp_transport / t->iov objects. With patch 2
tcp_free matches conn_free at 20/20 and kmemleak is clean across
two independent post-patch runs. opi_rcu=160 confirms the RCU
opinfo release path that motivates the fix is exercised.

Patch 2 also addresses two adjacent issues exposed by the new
debugging context:

* __close_file_table_ids() refactor. session_fd_check() already
sleeps in kstrdup(GFP_KERNEL) and down_write(m_lock) before this
patch, but the existing code calls it under write_lock(&ft->lock)
(an rwlock_t). Refactor so skip() runs outside ft->lock with a
transient reference on fp; idr_remove(), fp->volatile_id clear
and the m_fp_list unlink happen unconditionally so a deferred
final putter (atomic_sub_and_test(2) returning false) cannot be
left to do them via __ksmbd_remove_fd() with a stale
volatile_id.

Validated on an additional debug kernel built with
CONFIG_DEBUG_LIST + CONFIG_DEBUG_OBJECTS_WORK using a
same-session two-tcon storm: one tcon drives an open/write
storm while the other tcon repeats 50 tree disconnects on the
same session. Trace counts: 52 __close_file_table_ids
invocations, 4793 __ksmbd_close_fd, 30337 __put_fd_final, 9578
ksmbd_conn_put, 1 __ksmbd_conn_release_work. No
list-corruption, work_struct ODEBUG, sleep-in-atomic, lockdep
or kmemleak reports observed. This stress validates the
file-table/id/list rewrite under DEBUG_LIST/DEBUG_OBJECTS_WORK,
not the transport leak (which the A/B above already covered).

* fp owns a strong reference on fp->conn. fp used to hold a
borrowed pointer to its connection, leaving a window where the
conn could be freed while a still-live fp held a stale
fp->conn. Make fp own a refcount on fp->conn from
ksmbd_open_fd() / ksmbd_reopen_durable_fd() until __ksmbd_close_fd()
or session_fd_check(). ksmbd_reopen_durable_fd() is also
reordered so fp->conn / fp->tcon are set before __open_id()
publishes fp into the session's file table, so a concurrent
teardown cannot observe a valid volatile_id with fp->conn ==
NULL.

The two patches are independent. Either can be reviewed, applied,
or reverted without pulling the other. Patch 2 references patch 1
only to explain why stop_sessions() is left on its open-coded local
cleanup and not converted to ksmbd_conn_put() in this series; if
patch 2 is applied alone, stop_sessions() retains its bare-kfree
leak that patch 1 separately addresses.

Based on ksmbd-for-next-next.

DaeMyung Kang (2):
ksmbd: rewrite stop_sessions() with restartable iteration
ksmbd: centralize ksmbd_conn final release to plug transport leak

fs/smb/server/connection.c | 120 ++++++++++++++++++-----
fs/smb/server/connection.h | 6 ++
fs/smb/server/oplock.c | 4 +-
fs/smb/server/server.c | 12 +++
fs/smb/server/vfs_cache.c | 189 ++++++++++++++++++++++++++++++++-----
5 files changed, 285 insertions(+), 46 deletions(-)

--
2.43.0