Re: Booting from Qlogic qla2300 fibre channel card

From: Lincoln Dale (ltd@cisco.com)
Date: Wed Apr 16 2003 - 01:56:16 EST


Hi,

At 08:18 AM 16/04/2003 +0200, Jurjen Oskam wrote:
>At work, we are looking to deploy several Linux boxes on our SAN. The
>machines will be IBM eServer xSeries 345 with Qlogic qla2340 Fibre Channel
>cards, and no internal disks.
>
>The storage array is an EMC Symmetrix model 8530. EMC created a document
>where they explain how to make such a configuration work. When they mention
>booting from a Symmetrix-provided volume, they mention the following:
>
>"If Linux loses connectivity long enough, the disks disappear from the
>system. [...] For [this reason], EMC recommends that you do not boot a
>Linux host from the EMC storage array."

in general, all OSes get rather upset if disks disappear under
them. particularly if those disks contain swap -- exactly how is the
machine meant to recover from that?

some recommendations:
  - run with Matthew Jacob's "feral" driver rather than QLogic's driver;
    it has much better error recovery
  - you may want to increase SCSI_TIMEOUT in drivers/scsi/scsi.h

in my lab here, i do a ton of work on Fibre Channel & iSCSI.
the best setup i've found is that i end up using ramfs as my root and
having lots of things in there. sure, it burns a bit of ram, but i can be
sure that if i'm doing anything that could impact the i/o path, it's on less
system-critical stuff. since it's a lab and the things running on the hosts
aren't RAM hogs, i don't have swap either. you probably can't get away
with that, so i'd recommend doing some extensive testing pulling cables out
and seeing what happens and tuning timers to cope with it accordingly.
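
as a rough sketch of that kind of cable-pull test (python, my own
illustration and not anything from EMC or QLogic; the function names, the
probe file path and the 2x safety margin are all assumptions): write and
fsync a small block on a SAN-backed filesystem in a loop while you pull
cables, record how long each write takes, and use the longest stall as a
starting point for timer tuning.

```python
import math
import os
import time


def probe_stalls(path, duration_s=60.0, interval_s=0.5, block=b"x" * 4096):
    """Write and fsync a small block repeatedly for duration_s seconds.

    Returns (latencies, errors): per-attempt latency in seconds and the
    number of attempts that failed with an OSError (for example EIO).
    """
    latencies = []
    errors = 0
    deadline = time.monotonic() + duration_s
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        while time.monotonic() < deadline:
            start = time.monotonic()
            try:
                os.lseek(fd, 0, os.SEEK_SET)
                os.write(fd, block)
                os.fsync(fd)  # push the write down the i/o path
            except OSError:
                errors += 1  # the i/o came back as a hard error
            latencies.append(time.monotonic() - start)
            time.sleep(interval_s)
    finally:
        os.close(fd)
    return latencies, errors


def suggest_timeout(latencies, margin=2.0):
    """Longest observed stall times a margin, rounded up to whole seconds.

    The margin is an arbitrary assumption, not a recommended value.
    """
    if not latencies:
        raise ValueError("no samples")
    return math.ceil(max(latencies) * margin)
```

run it against a file on the volume under test rather than on the root
disk you are about to yank, or the probe itself may hang; something like
probe_stalls("/mnt/san/probe.dat", duration_s=300), where the path is
hypothetical. a non-zero error count means the outage outlasted whatever
timers are currently in effect. the suggested number is only a starting
point, since the driver's own retry and login timers matter as well.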

>When making an online configuration change on the Symmetrix (such as
>remapping volumes), it is possible for the attached hosts to experience
>a temporary error while accessing a storage array volume. For example,

are you sure this tech note will still apply with the DMX?
i'd imagine that there are still bin file changes that can cause this kind
of thing, but it's something i believe EMC was addressing with the DMX.

>when changing the Symmetrix configuration, it is not uncommon for the
>RS/6000s (also attached to the SAN) to log one or two temporary
>SCSI-errors. They don't cause any problems at all, the AIX volume manager
>never notices a problem.

on RS/6000s, the rules were somewhat different. the HBAs that IBM had for
RS/6000s typically only tried to issue FLOGIs once every 30 seconds - so you
would be more likely to see timeout errors if you impacted the flow of
traffic temporarily.

cheers,

lincoln.
