Re: [PATCH v2 4/7] scsi: libsas: split the replacement of sas disks in two steps

From: Jason Yan
Date: Thu Jan 31 2019 - 20:58:59 EST




On 2019/2/1 0:38, John Garry wrote:
On 31/01/2019 10:29, John Garry wrote:
On 31/01/2019 02:04, Jason Yan wrote:


On 2019/1/31 1:22, John Garry wrote:
On 30/01/2019 08:24, Jason Yan wrote:
Now if a new device replaced a old device, the sas address will
change.

Hmmm... not if it's a SATA disk, which would have some same invented
SAS
address.


Yes, it's only for a SAS disk.

We unregister the old device and discover the new device in one
revalidation process. But after we deferred the sas_port_delete(), the
sas port is not deleted when we registering the new port and device.
The sas port cannot be added because the name of the new port is the
same as the old.

Fix this by doing the replacement in two steps. The first revalidation
only delete the old device and trigger a new revalidation. The second
revalidation discover the new device. To keep the event processing
synchronised to the original event,

This change seems ok, but please see below regarding generating the
bcast events.


Did I originally suggest this? It seems to needlessly make the code
more
complicated.


Yes, my first version was raise a new bcast event, and you said it's not
synchronised to the original event. Shall I get back to that approach?

Not sure. This patch seems to fix something closely related to that in
"scsi: libsas: fix issue of swapping two sas disks", which I will check
further.


An idea:

So, before the libsas changes to generate dynamic events, when libsas
was processing a particular event type - like a broadcast event - extra
events generated by the LLDD were discarded by libsas.

The revalidation process attempted to do all revalidation for the domain
is a single pass, which was ok. This really did not change.

However, in this revalidation pass, we also clear all expander and PHY
events.


Actually we only clean one expander and it's attached PHYs events now.

Maybe this is not the right thing to do. Maybe we should just clear a
single PHY event per pass, since we're processing each broadcast event
one-by-one.


Yes, we can do this. But I don't understand how this will fix the issue?
We have this issue now because we have to probe the sas port and/or delete the sas port out side of the disco_mutex. So for a specific PHY, we cannot add and delete at the same time inside the disco_mutex.

Today you will notice that if we remove a disk for example, many
broadcast events are generated, but only the first broadcast event
actually does any revalidation.

EOM