Re: [PATCH v2 4/7] scsi: libsas: split the replacement of sas disks in two steps

From: John Garry
Date: Fri Feb 01 2019 - 04:34:40 EST


On 01/02/2019 01:58, Jason Yan wrote:


On 2019/2/1 0:38, John Garry wrote:
On 31/01/2019 10:29, John Garry wrote:
On 31/01/2019 02:04, Jason Yan wrote:


On 2019/1/31 1:22, John Garry wrote:
On 30/01/2019 08:24, Jason Yan wrote:
Now if a new device replaced a old device, the sas address will
change.

Hmmm... not if it's a SATA disk, which would have some same invented
SAS
address.


Yes, it's only for a SAS disk.

We unregister the old device and discover the new device in one
revalidation process. But after we deferred the sas_port_delete(),
the
sas port is not deleted when we registering the new port and device.
The sas port cannot be added because the name of the new port is the
same as the old.

Fix this by doing the replacement in two steps. The first
revalidation
only delete the old device and trigger a new revalidation. The second
revalidation discover the new device. To keep the event processing
synchronised to the original event,

This change seems ok, but please see below regarding generating the
bcast events.


Did I originally suggest this? It seems to needlessly make the code
more
complicated.


Yes, my first version was raise a new bcast event, and you said it's
not
synchronised to the original event. Shall I get back to that approach?

Not sure. This patch seems to fix something closely related to that in
"scsi: libsas: fix issue of swapping two sas disks", which I will check
further.


An idea:

So, before the libsas changes to generate dynamic events, when libsas
was processing a particular event type - like a broadcast event - extra
events generated by the LLDD were discarded by libsas.

The revalidation process attempted to do all revalidation for the domain
is a single pass, which was ok. This really did not change.

However, in this revalidation pass, we also clear all expander and PHY
events.


Actually we only clean one expander and it's attached PHYs events now.

ok, fine, it's just for one expander; but we still do clear that one expanders events fully.

However we would have to be careful here to ensure that we don't have a situation where we still have PHY events pending but no broadcast events to trigger the revalidation and subsequent processing.


Maybe this is not the right thing to do. Maybe we should just clear a
single PHY event per pass, since we're processing each broadcast event
one-by-one.


Yes, we can do this. But I don't understand how this will fix the issue?

It would solve the problem of having to fixup the expanders events = -1, which I mentioned was not so nice.

As for fixing the main problem, I was not against the idea of the other change in sas_rediscover_dev() to not call sas_discover_new() when the SAS address has changed.

We have this issue now because we have to probe the sas port and/or
delete the sas port out side of the disco_mutex. So for a specific PHY,
we cannot add and delete at the same time inside the disco_mutex.

Today you will notice that if we remove a disk for example, many
broadcast events are generated, but only the first broadcast event
actually does any revalidation.

EOM




.