Re: [PATCH v4] scsi: libsas: Fix the failure of adding phy with zero-address to port

From: yangxingui
Date: Mon Dec 04 2023 - 06:49:13 EST


Hi, John

On 2023/12/1 17:22, John Garry wrote:
On 30/11/2023 03:53, yangxingui wrote:

For phy19, when the phy is attached and added to the parent wide port, the path is:
sas_rediscover()
     ->sas_discover_new()
         ->sas_ex_discover_devices()
             ->sas_ex_discover_dev()
                 -> sas_add_parent_port().

ok, so then the change to set ex_phy->port = ex->parent_port looks ok. Maybe we can put this in a helper with the sas_port_add_phy() call, as it is duplicated in sas_ex_join_wide_port()

Do we also need to set ex_phy->phy_state (like sas_ex_join_wide_port())?

Well, okay, as follows?
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -856,9 +856,7 @@ static bool sas_ex_join_wide_port(struct domain_device *parent, int phy_id)

                 if (!memcmp(phy->attached_sas_addr, ephy->attached_sas_addr,
                             SAS_ADDR_SIZE) && ephy->port) {
-                       sas_port_add_phy(ephy->port, phy->phy);
-                       phy->port = ephy->port;
-                       phy->phy_state = PHY_DEVICE_DISCOVERED;
+                       sas_port_add_ex_phy(ephy->port, phy);
                         return true;

this looks ok. How about adding this helper and using it in a separate change?
Okay, then I will update the version.

                 }
         }
diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
index e860d5b19880..39ffa60a9a01 100644
--- a/drivers/scsi/libsas/sas_internal.h
+++ b/drivers/scsi/libsas/sas_internal.h
@@ -189,6 +189,13 @@ static inline void sas_phy_set_target(struct asd_sas_phy *p, struct domain_devic
         }
  }

+static inline void sas_port_add_ex_phy(struct sas_port *port, struct ex_phy *ex_phy)
+{
+       sas_port_add_phy(port, ex_phy->phy);
+       ex_phy->port = port;
+       ex_phy->phy_state = PHY_DEVICE_DISCOVERED;
+}

I'd prefer sas_expander.c, but sas_add_parent_port() is here... having said that, sas_add_parent_port() is only used in sas_expander.c
Okay, then I will update the version and move it to sas_expander.c .


+
  static inline void sas_add_parent_port(struct domain_device *dev, int phy_id)
  {
         struct expander_device *ex = &dev->ex_dev;
@@ -201,8 +208,7 @@ static inline void sas_add_parent_port(struct domain_device *dev, int phy_id)
                 BUG_ON(sas_port_add(ex->parent_port));
                 sas_port_mark_backlink(ex->parent_port);
         }
-       sas_port_add_phy(ex->parent_port, ex_phy->phy);
+       sas_port_add_ex_phy(ex->parent_port, ex_phy);
  }


And the path called when it is removed from parent wide port is:
sas_rediscover()
     ->sas_unregister_devs_sas_addr() // The sas address of phy19 becomes 0. Since ex_phy->port is NULL, phy19 is not removed from the parent wide port's phy_list.

For phy0, it is connected to a new sata device.
sas_rediscover()
     ->sas_discover_new()->sas_ex_phy_discover()
                             ->sas_ex_phy_discover_helper()
                                 ->sas_set_ex_phy() // The device type is stp. Since the linkrate is 5 and less than 1.5G, sas_address is set to 0.

Then when we get the proper linkrate later, will we then rediscover and set the proper SAS address? I am just wondering if this change is really required?
Yes, but in fact it has not reached that stage yet. After setting the address to 0, it will continue to create a new port and try to add other phys with the same address as it to this new port.

creating a port for SAS address == 0 and adding phys seems incorrect, right?
Yes. There are three possible ways to solve the problem of creating a port with a zero address:
1. Use the sas address obtained by querying the expander instead of the zero address.
2. Forbid the phy with an address of 0 to create a port.
3. When the rate is less than 1.5G, do not let it enter sas_ex_discover_end_dev().

Because when the device type is not empty, its SAS address is legal, and we are currently using the first one.



BTW, Even with the change to set ex_phy->port = ex->parent_port, are we still joining the host-attached expander phy (19) to a port with SAS address == 0?
Yes, in order to avoid this situation, in the current patch, we will not force the SAS address to be set to 0 when the device type is not NULL, but will still use the address obtained after requesting the expander.

ok, let me check that again later today.
OK.

Thanks
Xingui