On Monday 18 November 2002 04:22 pm, Ragnar Kjørstad wrote:
> On Mon, Nov 18, 2002 at 04:11:06PM -0600, Jesse Pollard wrote:
> > > No, you need to move the IP-address from the old nfs-server to the new
> > > one. Then to the clients it will look like a regular reboot. (Check out
> > > heartbeat, at http://www.linux-ha.org/)
> > >
> > > You need to make sure that NFS is using the shared ip (the one you move
> > > around) rather than the fixed ip. (I assume you will have a fixed ip on
> > > each host in addition to the one you move around). Also, you need to
> > > put /var/lib/nfs on shared stoarage. See the archive for more details.
> >
> > It would actually be better to use two floating IP numbers. That way
> > during normal operation, both servers would be functioning simultaneously
> > (based on the shared storage on two nodes).
> >
> > Then during failover, the floating IP of the failed node is activated on
> > the remaining node (total of 3 IP numbers now, one real, two floating).
> > The NFS recovery cycle should then cause the clients to remount the
> > filesystem from the backup server.
>
> Yes, that would be better.
>
> But it would not work as described above. There are some important
> limitations here:
>
> - I assumed that /var/lib/nfs is shared. If you want two servers to
> be active at once you need a different way to share lock-data.
>
> - AFAIK there is no way for statd to service 2 IP's at once.
> It will (AFAIK) bind to both adresses, but the problem is the
> message that is sent out at startup and includes the ip of
> the local host.
>
> Neither limitation is a law of nature. They can be fixed. I think there
> is work going on to change the way locks are stored, and I'm sure the
> second problem can be solved as well.
Actually, I was thinking that each server served a different mountpoint
instead of both providing the same one.
I'm not sure how the locks currently would be provided unless the
distributed lock from the shared storage interacts with each servers statd
properly. Otherwise you will already have problems.
Second, I thought that statd didn't care about the lock requests coming
from two IP numbers. This should be no different than having two network
interfaces attached to one server (and that works under Solaris). The
client should be using the name from the IP number, not the router used
between the client and server. I view the floating IP as existing behind
a router using the real IP. Since none of the clients are using the real
IP, the naming should remain consistant (I think).
> There may be solutions out there already. E.g. maybe Lifekeeper or
> Convolo include better support for this?
I don't know.
-- ------------------------------------------------------------------------- Jesse I Pollard, II Email: pollard@navo.hpc.milAny opinions expressed are solely my own. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sat Nov 23 2002 - 22:00:25 EST