Re: Linux Cluster using shared scsi

From: Eric Z. Ayers (Eric.Ayers@intec-telecom-systems.com)
Date: Tue May 01 2001 - 16:07:57 EST

Next message: Linus Torvalds: "Re: isa_read/write not available on ppc - solution suggestions ??"
Previous message: Alan Cox: "Re: Maximum files per Directory"
In reply to: Alan Cox: "Re: Linux Cluster using shared scsi"
Next in thread: Alan Cox: "Re: Linux Cluster using shared scsi"
Reply: Alan Cox: "Re: Linux Cluster using shared scsi"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Alan Cox writes:
> > Does this package also tell the kernel to "re-establish" a
> > reservation for all devices after a bus reset, or at least inform a
> > user level program? Finding out when there has been a bus reset has
> > been a stumbling block for me.
>
> You cannot rely on a bus reset. Imagine hot swap disks on an FC fabric. I
> suspect the controller itself needs to call back for problem events
>

I'm not an SCSI expert by any stretch of the imagination. I think
that what you are saying is that you cannot rely that a bus reset is
as only thing that will remove a reservation. For example, if a
device is 'hot replaced', the device will (clearly) no longer be
reserved. But if you did such a hot swap you would have "bigger
fish to fry" in a HA application... I mean, none of your data would be
there!

My understanding is that specifically, when a bus reset occurs, all
SCSI reservations for devices on that bus are lost. I was hoping that
if the kernel (by this I mean the scsi midlayer) was maintaining
reservations, that there would be some logic activated to "handle"
this problem, whether it be re-reserving the device, or the ability to
pass notification of a reset (or another problem event as you point
out) up to the application that's handling reservations.

In my experience, the most common reason for a bus reset in parallel
SCSI is that a peer host on the bus is rebooting. Since this happens
under normal operation and well in advance of any attempt to acess the
device, it would be nice if there were some sort of asyncronous
notification instead of a polling process with an interval of 2-3
minutes, where it's conceivable that the peer system could have booted
and attempted to take-over the disk out from under a running system.

Bus resets in the Linux drivers also tend to happen frequently when a
disk is failing, which has tended to leave the system in a somewhat
functional but often an unusable state, (but that's a different story...)

James Bottomley <James.Bottomley@steeleye.com> writes:
> Essentially, there are many conditions which cause a quiet loss of a SCSI-2
> reservation. Even in parallel SCSI: Reservations can be silently lost because
>of LUN reset, device reset or even simple powering off the device.
...

James mentions that even handling a bus reset still leaves a window
where a peer could grab the reservation out from underneath an
un-suspecting host. I agree that this could happen, and the old host
might perform writes to an 'unreserved' disk, but once the second
system suceeded in obtaining the reservation, any read/write commands
from the "old" host would return SCSI errors (this is my layman's
understanding - the commands would return a UNIT_RESERVED error) , so
I believe you would have the desired behavior in this kind of cluster
- only one machine in the cluster can access the disk at the same
time. The data on the disk should be in a state where the second
system in the cluster could start a recovery task and begin to provide
the service hosted on the disk.

-Eric.

--
Eric Z. Ayers				              Lead Software Engineer
Phone:  +1 404-705-2864                    Computer Generation, Incorporated
Fax:    +1 404-705-2805                     an Intec Telecom Systems Company
Web:    http://www.intec-telecom-systems.com/
Email:  eric.ayers@intec-telecom-systems.com
Postal: Bldg G 4th Floor, 5775 Peachtree-Dunwoody Rd, Atlanta, GA 30342 USA
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Next message: Linus Torvalds: "Re: isa_read/write not available on ppc - solution suggestions ??"
Previous message: Alan Cox: "Re: Maximum files per Directory"
In reply to: Alan Cox: "Re: Linux Cluster using shared scsi"
Next in thread: Alan Cox: "Re: Linux Cluster using shared scsi"
Reply: Alan Cox: "Re: Linux Cluster using shared scsi"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Mon May 07 2001 - 21:00:11 EST