Re: [PATCH net-next v3 2/2] net/smc: handle -ENOMEM from smc_wr_alloc_link_mem gracefully
From: Paolo Abeni
Date: Thu Sep 25 2025 - 05:41:01 EST
On 9/21/25 11:44 PM, Halil Pasic wrote:
> @@ -836,27 +838,39 @@ int smcr_link_init(struct smc_link_group *lgr, struct smc_link *lnk,
> rc = smc_llc_link_init(lnk);
> if (rc)
> goto out;
> - rc = smc_wr_alloc_link_mem(lnk);
> - if (rc)
> - goto clear_llc_lnk;
> rc = smc_ib_create_protection_domain(lnk);
> if (rc)
> - goto free_link_mem;
> - rc = smc_ib_create_queue_pair(lnk);
> - if (rc)
> - goto dealloc_pd;
> + goto clear_llc_lnk;
> + do {
> + rc = smc_ib_create_queue_pair(lnk);
> + if (rc)
> + goto dealloc_pd;
> + rc = smc_wr_alloc_link_mem(lnk);
> + if (!rc)
> + break;
> + else if (rc != -ENOMEM) /* give up */
> + goto destroy_qp;
> + /* retry with smaller ... */
> + lnk->max_send_wr /= 2;
> + lnk->max_recv_wr /= 2;
> + /* ... unless droping below old SMC_WR_BUF_SIZE */
> + if (lnk->max_send_wr < 16 || lnk->max_recv_wr < 48)
> + goto destroy_qp;
If e.g. smc.sysctl_smcr_max_recv_wr == 2048 and
smc.sysctl_smcr_max_send_wr == 16, the above loop gives up a little
too early: after the first failure max_send_wr is halved to 8, which is
already below 16, even though max_recv_wr (now 1024) could still shrink
a lot. What about changing the termination condition to:
lnk->max_send_wr < 16 && lnk->max_recv_wr < 48
and using 2 as a lower bound for both lnk->max_send_wr and lnk->max_recv_wr?
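To illustrate the difference, here is a rough userspace model of the retry
loop, not kernel code. It assumes every smc_wr_alloc_link_mem() call fails
with -ENOMEM and counts the attempts made before giving up. The helper
count_attempts() and the macro names are made up for this sketch, and
reading "lower bound" as clamping each value to at least 2 is my
assumption:

```c
#include <assert.h>
#include <stdbool.h>

/* Thresholds taken from the quoted hunk. */
#define OLD_SEND_WR 16
#define OLD_RECV_WR 48
/* Lower bound suggested above; clamping to it is an assumption. */
#define WR_FLOOR 2

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical model: number of allocation attempts before the loop
 * gives up, assuming every attempt fails with -ENOMEM.
 * proposed == false: '||' condition from the patch, no floor.
 * proposed == true:  '&&' condition, both values clamped to WR_FLOOR.
 */
static unsigned int count_attempts(unsigned int send_wr,
				   unsigned int recv_wr, bool proposed)
{
	unsigned int attempts = 0;

	assert(send_wr > 0 && recv_wr > 0);
	for (;;) {
		attempts++;	/* the allocation fails here */
		send_wr /= 2;
		recv_wr /= 2;
		if (proposed) {
			send_wr = max_u(send_wr, WR_FLOOR);
			recv_wr = max_u(recv_wr, WR_FLOOR);
			if (send_wr < OLD_SEND_WR && recv_wr < OLD_RECV_WR)
				return attempts;
		} else {
			if (send_wr < OLD_SEND_WR || recv_wr < OLD_RECV_WR)
				return attempts;
		}
	}
}
```

With send_wr == 16 and recv_wr == 2048, the condition from the patch stops
after a single failed attempt, while the '&&' variant keeps halving
max_recv_wr and only gives up after the sixth failure.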
Thanks,
Paolo