Re: Hot pluggable CPUs ( was Linux 2.5 / 2.6 TODO (preliminary) )

From: James Sutherland (jas88@cam.ac.uk)
Date: Sat Jun 03 2000 - 14:44:37 EST


On Sat, 3 Jun 2000, Bruce Guenter wrote:

> On Sat, Jun 03, 2000 at 06:25:37PM +0100, James Sutherland wrote:
> > Every "component" is mounted on a carrier board; this then connects to
> > a pair of backplanes. Each individual component can, obviously, be
> > replaced; you can also remove/disable one backplane at once without
> > downtime.
>
> So, you've essentially got two complete systems (once you add up all the
> components) in a single box.

No. I have the same components, but organised to make one single machine
with N+N redundancy, rather than a pair of independent machines with no
redundancy at all.

> What does this buy you above having two completely independent boxes?

Redundancy. Your approach gives you two machines, each with, say, 99.99%
availability. Mine gives a single machine with, perhaps, 99.9999%. Neither
of your two machines, taken alone, comes anywhere near that.
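
To put rough numbers on this, here is a small sketch in Python. It assumes
that failures of the two redundant halves are statistically independent,
which is an idealisation: under that assumption a pair of 99.99% units
comes out at 99.999999%, so the "perhaps 99.9999%" figure above leaves
room for common-mode failures and imperfect failover. The function names
are made up for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years


def parallel_availability(unit: float, copies: int = 2) -> float:
    """Availability of `copies` redundant units where any one suffices.

    Assumes independent failures (an idealisation, not a measured model).
    """
    return 1.0 - (1.0 - unit) ** copies


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


single = 0.9999                                # one box, "four nines"
redundant = parallel_availability(single, 2)   # idealised redundant pair
quoted = 0.999999                              # the more cautious figure

for label, a in [("single box", single),
                 ("quoted redundant figure", quoted),
                 ("idealised redundant pair", redundant)]:
    print(f"{label:26s} {a:.8%}  "
          f"{downtime_minutes_per_year(a):10.4f} min/year down")
```

Four nines works out to about 52.56 minutes of downtime a year; six nines
is roughly half a minute.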

> I wouldn't be surprised if a single box with all the redundant
> components costs more than the total price of two separate boxes.

Yes - you are paying through the nose for the extra 9s of availability.
There are markets where the client is more than happy to do so; in mission
critical apps, double the price for an extra 9 is a bargain.

> BTW, does hardware like this exist yet? I've seen Compaq's with
> hot-swap CPU and RAM support, but nothing with dual motherboards.

Compaq themselves don't do this sort of HA kit. Look at Nortel Networks'
exchanges, routers etc., or Tandem's Himalaya systems - they implement a
larger-scale version of what I describe, and have done for years.

> > The next issue is to enable software upgrades without downtime. For
> > applications, this can be done by installing the new version, then
> > signalling the old version to "exec" the new one. (Apache can do
> > something similar with configuration files already.) For a WWW server,
> > for example, this can be done without dropping or refusing a single
> > connection.
>
> Note that having separate boxes would solve this problem as well. Just
> upgrade one, bring it back up, and then upgrade the other.

Except that having two machines, one of which is working, is NOT
equivalent to having one machine always operational.

James.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/


