Re: Ken Thompson interview in IEEE Computer magazine (fwd)

Kris Karas (ktk@ktk.bidmc.harvard.edu)
Sat, 08 May 1999 21:43:20 -0400


Ian D Romanick wrote:

> Even more important in a cluster is the ability to do a rolling upgrade.
> Shut one system down while the other one stays up. Upgrade the OS on the
> system that you shutdown and reboot. Do the same thing on the other system.
> It's (in theory) the no-downtime OS upgrade. Can Linux or Windows do that?

Yes, Linux can. Been there, done that. As long as only one of the two HA boxes
has the filesystem (stored on shared disk) mounted at any given time, the scheme
works just fine. The one major problem with this setup is that, lacking a
journaling filesystem, a full FSCK of the shared ext2 disks may need to be
performed if the failing system couldn't cleanly unmount. Maybe reiserfs (sp?)
will help here.

For this reason, I opted to set up my HA system using duplicate disk as well as
duplicate computing platforms; rsync and/or omirr guarantees the synchronicity of
data. Add network aliasing, heartbeats on private IP addresses, some
manipulation of init runlevels, and there you go. This linux system has been up
for four years! It's in a hospital environment that requires 24x7 access.
Thanks to rolling updates, it's running at 2.2.7 now. :-) Two disk failures and
one root-compromise attack have been the only real challenges.

Kris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/