Re: devfs persistence

From: Eduardo Horvath (eeh@turbolinux.com)
Date: Fri Apr 28 2000 - 17:57:14 EST


On Fri, 28 Apr 2000, Matthew Jacob wrote:

> > First of all Node or Port WWNs are not sufficient for this purpose. Let's
> > say you have a RAID box with two controllers. Each controller has its
> > own WWN: WWN0 and WWN1. One of the controllers fails and needs to be
> > replaced. The new controller has a different WWN: WWN2. But it turns out
> > that the controller really wasn't bad, it just had a loose connection. So
> > it's used when a controller fails on another RAID box on the same
> > SAN. Now the original box has WWN1 and WWN2, but another box has WWN0 and
> > WWN3. The volumes are still in the original box, but now you have a new,
> > completely different set of volumes that magically appear attached to
> > WWN0.
>
> Umm, no, I can't say I entirely agree with this scenario. It's similar to
> an ethernet card- you've moved the card with the MAC address. You have to
> delete the arp entries (if they haven't timed out) and update your bootparams
> file because the binding of platter behind the card now has a different
> address that gets to it.

Precisely.
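
To make the failure concrete, here is a rough Python sketch of the swap
described above. The box names, the bindings table and the /dev/raid0 name
are all made up for illustration; none of this is a real driver interface.

```python
# Hypothetical model: which RAID box each controller WWN currently sits in.
san = {"WWN0": "boxA", "WWN1": "boxA"}

# A host that names devices by controller WWN records this binding once.
bindings = {"/dev/raid0": "WWN0"}   # made-up device name


def resolve(name):
    """Return the box a WWN-keyed device name currently leads to."""
    return san.get(bindings[name])


before = resolve("/dev/raid0")      # leads to boxA

# WWN0 is pulled from boxA and replaced by WWN2 ...
del san["WWN0"]
san["WWN2"] = "boxA"
# ... and later reused in boxB, next to WWN3.
san["WWN3"] = "boxB"
san["WWN0"] = "boxB"

after = resolve("/dev/raid0")       # same name, now boxB's volumes
```

The binding never changed, but the volumes behind it did.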

> In the case of an FC-AL disk (no raid box), you have something like
>
> 200000ABCDEFG Node WWN, lun 0
> 210000ABCDEFG Port A WWN, lun 0
> 220000ABCDEFG Port B WWN, lun 0
>
> In the case of one putative RAID box, you have
>
> 200000ABCDEFG Node WWN, luns 0..N
> 200100ABCDEFG Port A1 WWN, luns 0..N
> 200200ABCDEFG Port B1 WWN, luns 0..N
> 210100ABCDEFG Port A2 WWN, luns 0..N
> 210200ABCDEFG Port B2 WWN, luns 0..N
>
> and so on.. (haven't we argued about this before?)

Yes, we've argued this before. What you've described is a single
controller/4-port RAID box. That has a single, shared cache.

The more common implementation is a multiple controller RAID box where
each controller has its own cache. That would look something like this:

200000ABCDEFG Node A WWN, luns 0..N
200100ABCDEFG Port A1 WWN, luns 0..N
200200ABCDEFG Port A2 WWN, luns 0..N
200300ABCDEFG Port A3 WWN, luns 0..N
400000DEADBEE Node B WWN, luns 0..N
400100DEADBEE Port B1 WWN, luns 0..N
400200DEADBEE Port B2 WWN, luns 0..N
400300DEADBEE Port B3 WWN, luns 0..N
etc.
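
As a sketch only, the listing above can be written down as a table of
controllers and their ports. The WWNs are treated as opaque strings copied
from the listing, and the function name is hypothetical.

```python
# Node WWN of each controller -> port WWNs on that controller.
controllers = {
    "200000ABCDEFG": ["200100ABCDEFG", "200200ABCDEFG", "200300ABCDEFG"],
    "400000DEADBEE": ["400100DEADBEE", "400200DEADBEE", "400300DEADBEE"],
}


def paths_to_lun(lun):
    """Every (node WWN, port WWN, lun) triple through which a lun is visible."""
    return [(node, port, lun)
            for node, ports in controllers.items()
            for port in ports]


paths = paths_to_lun(0)
node_wwns = {node for node, _, _ in paths}
```

Each lun is reachable through six ports under two different node WWNs, so
neither kind of WWN picks out the lun by itself.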

The current port 1/2 or A/B stuff was developed by Seagate specifically
for dual-ported disks before anyone considered the implications on more
complicated multi-port devices.

> where luns 0..N are either the actual spindles, or some virtually defined
> storage- depends on the box.

Yes.

> So, if you change the card on the RAID box (or do this just in the class 3
> service params), you've changed the identity of all the luns (wrt WWNs).

No. This mess resulted in an extremely annoying discussion. It's been a
while, so I can't directly quote the spec, but what it comes down to is
that NWWNs are attached to controllers, each of which can have multiple
ports.

If you have a host (computer) with two FCAs, if it is properly implemented
each FCA should have a different NWWN. Further, if each FCA has two
ports, each port on that FCA shares a NWWN but has a unique PWWN. Sharing
a NWWN across the two FCAs does not conform to the spec, and is extremely
difficult to implement if you have two different types of FCAs, each with
a different driver.

You might be able to implement it another way, but it would not be
correct. And all RAID boxes I'm aware of behave this way so you should be
prepared to handle this.

> Actual VPID of each lun is a separate measure, which may or may not change
> along with it.

VPID is supposed to be unique and tied to the platter/volume/lun/data. It
should not be following the controller.
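
If the VPID behaves that way, a host can key its device names on it and
treat WWNs as nothing more than paths. A rough sketch, assuming the VPID has
already been read from each lun somehow; the "vol-a"/"vol-b" values and the
function name are invented for the example.

```python
from collections import defaultdict


def group_by_vpid(paths):
    """paths: iterable of (port_wwn, lun, vpid) tuples.

    Returns {vpid: [(port_wwn, lun), ...]}, one entry per volume.
    """
    volumes = defaultdict(list)
    for port_wwn, lun, vpid in paths:
        volumes[vpid].append((port_wwn, lun))
    return dict(volumes)


seen = [
    ("WWN1", 0, "vol-a"),
    ("WWN2", 0, "vol-a"),   # replacement controller, same volume
    ("WWN0", 0, "vol-b"),   # old controller, now in front of another box
]
volumes = group_by_vpid(seen)
```

Swapping a controller then adds or removes a path under an existing volume
instead of changing which volume a name refers to.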

> Actually, real life practice about this at Veritas has convinced me that, in
> fact, different systems on a SAN shouldn't even see the other systems' disks
> unless they proactively share them. Have you seen the latest T10 proposals for
> ACLs?

I don't have time to follow this anymore but the last time I looked they
seemed ugly and of questionable value.

I think this is getting a little esoteric, not to mention off topic, so
I'll stop now before I describe how to implement a storage server on a
general purpose computer. :)

Eduardo Horvath



