On 7/6/2023 5:07 AM, Daniel Wagner wrote:
Well, this is more a technical detail while we continue to harp about 'sync' vs 'non-sync'.
Hi James,
To me this is not a sync vs non-sync option; it's a max_reconnects value tested for in nvmf_should_reconnect(), which, if set to 0 (or 1), should fail if the initial connect fails.
On Sat, Jul 01, 2023 at 05:11:11AM -0700, James Smart wrote:
As much as you want to make this change to make transports "similar", I am dead set against it unless you are completing a long qualification
of the change on real FC hardware and FC-NVME devices. There is probably 1.5 yrs of testing of different race conditions that drove this change.
You cannot declare success from a simplistic toy tool such as fcloop for validation.
The original issues exist, probably have even morphed given the time from
the original change, and this will seriously disrupt the transport and any
downstream releases. So I have a very strong NACK on this change.
Yes - things such as the connect failure results are difficult to return
back to nvme-cli. I have had many gripes about the nvme-cli's behavior over
the years, especially on negative cases due to race conditions which
required retries. It still fails this miserably. The async reconnect path
solved many of these issues for fc.
For the auth failure, how do we deal with things if auth fails over time as
reconnects fail due to a credential change? I would think commonality of
this behavior drives part of the choice.
Alright, what do you think about the idea to introduce a new '--sync' option to
nvme-cli which forwards this info to the kernel that we want to wait for the
initial connect to succeed or fail? Obviously, this needs to handle signals too.
From what I understood, this is also what Ewan would like to have.
Right now max_reconnects is calculated from ctrl_loss_tmo and reconnect_delay, so there's already a way via the cli to make sure there's only 1 connect attempt. I wouldn't mind seeing an exact cli option that sets it to 1 connection attempt without the user having to calculate and specify 2 values.
Again, we do _not_ propose to change any of the default settings.
I also assume that this is not something that would be set by default in the auto-connect scripts or automated cli startup scripts.
You assume correctly. That's why it'll be an additional option.
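For readers following the max_reconnects point above, here is a minimal userspace sketch of how a reconnect budget could be derived from ctrl_loss_tmo and reconnect_delay and then tested. It is a simplified model, not the kernel source: the helper names calc_max_reconnects() and should_reconnect() are hypothetical, the round-up division and the "-1 means retry forever" convention are assumptions, and the real nvmf_should_reconnect() operates on the controller structure rather than on plain integers.

```c
#include <stdbool.h>

/* Round-up integer division (assumed to match how the budget is derived). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: derive the attempt budget from the two timeouts.
 * A negative ctrl_loss_tmo is assumed to mean "retry forever" (-1).
 * reconnect_delay must be greater than zero.
 */
int calc_max_reconnects(int ctrl_loss_tmo, int reconnect_delay)
{
	if (ctrl_loss_tmo < 0)
		return -1;
	return DIV_ROUND_UP(ctrl_loss_tmo, reconnect_delay);
}

/*
 * Simplified model of the reconnect check: keep retrying while the
 * number of attempts made so far is below the budget, or the budget
 * is unlimited.
 */
bool should_reconnect(int nr_reconnects, int max_reconnects)
{
	return max_reconnects == -1 || nr_reconnects < max_reconnects;
}
```

Under these assumptions, setting ctrl_loss_tmo equal to reconnect_delay yields a budget of 1, and a ctrl_loss_tmo of 0 yields a budget of 0, so no further reconnect is scheduled once that budget is used up. How the attempt counter is incremented around the initial connect is transport-specific and not modelled here.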