mirrored machines via network

James R. Leu (jleu@chaos.coredcs.com)
Fri, 1 Nov 1996 16:23:52 -0600 (CST)


First off, please let me apologize for my lack of kernel programming
knowledge; this is my first venture into kernel space. My basic goals for
this letter are 1) a reality check and 2) to obtain a list of resources on
where to look or what to read to get me started.

As my title below states, I'm in charge of administering all the network
operations of a small ISP. I am looking into ways of implementing redundant
servers. By this I mean two machines that are exactly the same as far
as services provided and data stored. Their only difference would be the
IP address assigned to each. My example will hopefully reveal why I want to
achieve this.

The idea I have necessitates two main developments. The first is
an "intelligent" name-server. I have begun development of this already,
starting from the bind-4.x.x source. The end result will hopefully be a
name-server that checks if a machine is responding before it gives out the IP
address. The second is the "mirrored machines via the network".
Because of my limited knowledge of kernel I/O, I'm not really sure how to
implement such a driver/service. My initial look into the I/O subsystem led
me into the drivers/block directory of the Linux kernel source. I looked at hd.c
and could understand what was happening, but I don't understand how it fits
into the whole scheme of things.
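
As a rough illustration of the "check before answering" idea, here is a
minimal user-space sketch in Python rather than a change to the bind source.
It assumes a TCP connect to a service port is an acceptable liveness test; the
names is_responding and pick_address, the port, and the timeout are all
hypothetical choices made for the example.

```python
import socket


def is_responding(address, port=80, timeout=2.0):
    """Return True if a TCP connection to address:port succeeds in time.

    Assumption: an accepted TCP connection is a good enough sign of life.
    A real check might speak the service protocol instead.
    """
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def pick_address(candidates, probe=is_responding):
    """Return the first candidate address that passes the probe, or None.

    candidates is ordered by preference, e.g. [machine A, machine B].
    probe is injectable so the selection logic can be tested offline.
    """
    for address in candidates:
        if probe(address):
            return address
    return None
```

A name-server built this way would call something like
pick_address(["192.0.2.10", "192.0.2.11"]) before answering a query, where the
two addresses stand in for machines A and B. Probing on every query would be
slow, so a real server would more likely probe in the background and cache the
result.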

The best way to illustrate how the entire system would work is to give an
example. I will refer to the primary machine as machine A and the backup
machine as machine B. There is at least a third machine on the network, which
operates as the primary DNS for the network. Under normal conditions
(i.e. no failed hardware, no kernel panics :o) machine A responds to all WWW
services, all e-mail services, and all authentication requests and usage
accounting (via RADIUS). The machine also serves shell logins. During this
time machine B needs to be constantly updated to account for changes in web
pages and changes in passwords; in general, all changes to the file system on A
need to be duplicated on B. The need for this duplication becomes clear when
machine A blows up. At that point the primary name server would notice that
machine A is not responding correctly. It would switch its internal tables for
machine A and point them to machine B.
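
To make the "duplicate every change from A onto B" requirement concrete, here
is a minimal sketch of a one-way mirror pass in Python. It is a user-space
approximation run periodically, not the kernel-level block driver discussed
above, and it assumes B's file system is reachable from A as a local path (for
example over a network mount). The names mirror_tree and needs_copy are
hypothetical.

```python
import os
import shutil


def needs_copy(src, dst):
    """Return True if dst is missing, differs in size, or is older than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    # Whole-second comparison tolerates file systems with coarse timestamps.
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def mirror_tree(source_root, target_root):
    """Copy new or changed files from source_root into target_root.

    Returns the sorted list of relative paths that were copied.
    Limitations of this sketch: deletions are not propagated, symlinks are
    followed rather than recreated, and files that change during the copy
    are not handled specially.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel_dir = os.path.relpath(dirpath, source_root)
        target_dir = os.path.normpath(os.path.join(target_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also preserves the mtime
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

A periodic pass like this leaves a window in which B is stale, which is
exactly the gap a driver that mirrors each write as it happens would close.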

I do realize there are many areas that need to be worked out, not the least of
which is how to determine if a machine is no longer responding, but I hope, with
some input from this community, to be able to overcome each problem and form
a truly reliable solution. I welcome all responses/tips/hints/flames.
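
On the question of deciding when a machine is "no longer responding", one
common approach is to require several consecutive failed probes before
switching, so that a single dropped packet does not trigger a failover. The
sketch below, in Python, shows only that bookkeeping; the class name
FailoverMonitor, the threshold of three, and the immediate fail-back on the
first successful probe are assumptions made for the example.

```python
class FailoverMonitor:
    """Track probe results for a primary and decide which address to hand out."""

    def __init__(self, primary, backup, max_failures=3):
        self.primary = primary
        self.backup = backup
        self.max_failures = max_failures  # consecutive failures before failover
        self.failures = 0
        self.active = primary

    def record_probe(self, ok):
        """Record one probe result for the primary; return the active address."""
        if ok:
            # Assumption: fail back as soon as the primary answers again.
            self.failures = 0
            self.active = self.primary
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.active = self.backup
        return self.active
```

For example, with monitor = FailoverMonitor("192.0.2.10", "192.0.2.11"), the
first two calls to monitor.record_probe(False) still return the primary's
address and the third returns the backup's. Failing back immediately is only
safe if any changes made on B in the meantime have been copied back to A
first, so a real setup would probably want that step to be manual or gated.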

Last but not least, if anyone has seen this or anything similar, please let me
know. I looked long and hard for something similar, but I'm sure my search
was not entirely exhaustive. Also, I know my view of this may be naive, so
please do enlighten me.

James

-- 
James R. Leu
Network Administrator
CORE Digital Communication Services
jleu@coredcs.com