Re: [PATCH net-next 1/2] net/smc: Introduce a specific sysctl for TEST_LINK time

From: Wen Gu
Date: Tue Sep 20 2022 - 02:23:19 EST




On 2022/9/20 12:55, dust.li wrote:

On Tue, Sep 20, 2022 at 10:53:54AM +0800, Wen Gu wrote:
SMC-R tests the viability of link by sending out TEST_LINK LLC
messages over RoCE fabric when connections on link have been
idle for a time longer than keepalive interval (testlink time).

But using tcp_keepalive_time as testlink time maybe not quite
suitable because it is default no less than two hours[1], which
is too long for single link to find peer dead. The active host
will still use peer-dead link (QP) sending messages, and can't
find out until get IB_WC_RETRY_EXC_ERR error CQEs, which takes
more time than TEST_LINK timeout (SMC_LLC_WAIT_TIME) normally.

So this patch introduces a independent sysctl for SMC-R to set
link keepalive time, in order to detect link down in time. The
default value is 30 seconds.

[1] https://www.rfc-editor.org/rfc/rfc1122#page-101

Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx>
---

/* called after lgr was removed from lgr_list */
diff --git a/net/smc/smc_llc.h b/net/smc/smc_llc.h
index 4404e52..1de9a29 100644
--- a/net/smc/smc_llc.h
+++ b/net/smc/smc_llc.h
@@ -19,6 +19,7 @@

#define SMC_LLC_WAIT_FIRST_TIME (5 * HZ)
#define SMC_LLC_WAIT_TIME (2 * HZ)
+#define SMC_LLC_TESTLINK_DEFAULT_TIME 30

I'm wondering why we don't follow the upper to macros using (30 * HZ) ?

Thanks for the reivew.

Because the value of sysctl_smcr_testlink_time is in seconds, and the value
of llc_testlink_time is jiffies.

I have thought about
1) using proc_dointvec_jiffies as sysctl's proc_handler just like TCP does.
But proc_dointvec_jiffies has no minimum limit, value 0 makes no sense for SMC testlink.
2) using proc_dointvec_ms_jiffies_minmax as proc_handler. But millisecond interval
seems expensive for SMC test link.

So, I choose to use proc_dointvec_minmax, make sysctl_smcr_testlink_time in
seconds, and convert to jiffies when assigning to llc_testlink_time.

Thanks,
Wen Gu.