Replication and server topology

As the number of IBM® Domino® servers on your network increases, so does the amount of replication required to distribute information across the network. Because replication uses memory and processing time, plan how servers connect to perform replication. If you allow servers to replicate at random, so that a given server replicates a single database with multiple servers, or perhaps replicates different databases with different servers, servers can become so overloaded with replication requests that the load interferes with their ability to respond to client requests.

To provide for efficient replication, consider setting up some servers as dedicated replication servers. Using dedicated servers to handle replication greatly reduces the amount of work that database servers have to devote to replication, because the database servers have to replicate with the replication servers only, instead of having to replicate with every server that maintains a copy of a given database. To control replication, you create Connection documents that specify which servers to replicate with and when.

How you connect servers for replication depends on many factors, including the layout of the physical network and the size of your organization, as well as the extent to which you want to re-use existing Connection documents created for mail routing. There are several different configurations, or topologies, you can use to control how replication occurs between servers:

Hub-and-spoke
Peer-to-peer
Ring

Choose the replication strategy that provides the most efficient replication performance. In many cases, you will use different topologies in different parts of the network.

Using a hub-and-spoke topology to manage replication

A hub-and-spoke topology is generally the most common and efficient replication topology in larger organizations, because it minimizes network traffic. Hub-and-spoke replication establishes one central server as the hub, which schedules and initiates all replication with all of the other servers, or spokes. The spokes update the hub server by replication (and mail routing), and the hub in turn updates each spoke. Hub servers replicate with each other or with master hub servers in organizations that use more than one hub. In short, the hub server acts as the traffic manager of the system, overseeing system resources, ensuring that replication takes place with each spoke in an orderly way, and guaranteeing that all changes are replicated to all spoke servers.

To set up replication in a hub-and-spoke system, you create one Connection document for each hub-and-spoke connection. To ensure that the replication task on the hub, rather than on the spokes, always does most of the work, specify in each Connection document the hub server as the source server, the spoke server as the destination server, and pull-push as the replication method.
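The rule above (one Connection document per hub-and-spoke connection, with the hub as source) can be sketched in Python. This is only an illustration for planning purposes: the ConnectionDoc record, its field names, and the server names are hypothetical and do not correspond to actual Domino Directory fields or to any Domino API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionDoc:
    # Hypothetical planning record; field names are illustrative only.
    source: str
    destination: str
    replication_method: str


def hub_and_spoke_connections(hub, spokes):
    """Return one record per hub-to-spoke link, with the hub as the source."""
    return [
        ConnectionDoc(source=hub, destination=spoke, replication_method="pull-push")
        for spoke in spokes
    ]


# Example server names are made up for this sketch.
docs = hub_and_spoke_connections(
    "Hub1/Acme", ["SpokeA/Acme", "SpokeB/Acme", "SpokeC/Acme"]
)
for doc in docs:
    print(f"{doc.source} -> {doc.destination} ({doc.replication_method})")
```

Because the hub is the source in every record, the hub's replication task initiates each exchange, and no spoke needs a Connection document of its own for replication.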

A hub-and-spoke topology can be especially useful at large, multiple-server sites or in a centralized office that needs to connect via phone or leased lines to smaller, regional offices. If you have a large site, you can use a combination of topologies -- for example, two hub-and-spoke arrangements and one peer-to-peer arrangement between the two hub servers.

The major drawback of a hub-and-spoke topology is that the hub is a single point of failure: if the hub is not working, replication stops. To alleviate this shortcoming, deploy a backup server that replicates the hub and can quickly be reconfigured into a hub server if the primary hub goes down.

Benefits of a hub-and-spoke topology

Install multiple protocols on hub servers to enable communication in a Domino system that uses more than one protocol. This places hub servers in multiple IBM Notes® named networks, another source of efficiency. Hub servers can connect multiple Notes named networks, where a single hub server and its spoke servers often make up one Notes named network.
Bridge parts of a network -- for example, a LAN and a WAN.
Centralize administration of the Domino Directory, standardize database ACLs, and limit access to the hub. You can give the hub Manager access and the spokes Reader access, so that changes made on the single replica on the hub are replicated to the spokes.
Designate hubs by role -- for example, replication hubs and mail hubs.
Place server programs such as message transfer agents on hubs to make them easily accessible.
Connect remote sites with a hub server.
Minimize network traffic and maximize network efficiency.
Centralize data backup at the hub. By backing up databases on the hub only, you conserve resources on spoke servers.
Improve server load balancing. However, network traffic increases on the hub LAN segment. If you have more than 25 servers per hub, establish tiers of hubs. If a hub goes down, replication for that hub and its spokes is disabled until the hub is repaired or replaced.
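As a rough planning aid for the guideline of no more than 25 servers per hub, the following Python sketch estimates how many hubs each tier would need. It assumes that every hub serves at most 25 servers from the tier below it and that tiers are added until a single top-level hub remains; the function name and this stacking rule are assumptions made for illustration, not a Domino feature.

```python
import math


def hub_tiers(spoke_count, max_per_hub=25):
    """Return the number of hubs per tier, lowest tier first.

    Assumes each hub serves at most max_per_hub servers from the tier
    below, and that tiers are stacked until one top-level hub remains.
    """
    if spoke_count < 1:
        raise ValueError("spoke_count must be at least 1")
    tiers = []
    below = spoke_count
    while True:
        hubs = math.ceil(below / max_per_hub)
        tiers.append(hubs)
        if hubs == 1:
            return tiers
        below = hubs


print(hub_tiers(20))   # a single hub is enough
print(hub_tiers(60))   # three hubs plus one master hub
print(hub_tiers(700))  # three tiers of hubs
```

In practice, a small top tier of two or three hubs might instead replicate peer-to-peer with each other, as described earlier for large sites.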

Note: Do not use hub-and-spoke replication for databases larger than 100 MB that have replicas on fewer than four servers. Instead, schedule replication for these databases to occur separately from other replications.

Using a peer-to-peer topology to manage replication

In a peer-to-peer topology, replication is less centralized than in a hub-and-spoke configuration, with every server being connected to every other server. Because peer-to-peer replication quickly disseminates changes to all servers, it is often the best choice for use in small organizations, or for sharing databases locally among a few servers. However, it can be inefficient when a database resides on more than a few servers.

In a peer-to-peer topology, the potential for replication problems decreases, because only two servers communicate for each replication and no hub or intermediary servers are involved. However, peer-to-peer replication requires many Connection documents, increases administrative work because you must avoid overlapping replication schedules, and prevents you from standardizing ACL requirements.
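To see how quickly the number of Connection documents grows, the following Python sketch compares the two topologies. It assumes one replication Connection document per connected pair of servers, that peer-to-peer means every server is connected to every other server, and that the hub counts as one of the servers; the function names are illustrative.

```python
def hub_and_spoke_documents(server_count):
    """One document per spoke; the hub is one of the servers."""
    return max(server_count - 1, 0)


def peer_to_peer_documents(server_count):
    """One document per pair of servers: n * (n - 1) / 2."""
    return server_count * (server_count - 1) // 2


for n in (3, 5, 10, 25):
    print(n, hub_and_spoke_documents(n), peer_to_peer_documents(n))
```

The hub-and-spoke count grows linearly with the number of servers, while the peer-to-peer count grows roughly with its square. This is why peer-to-peer suits only a few servers.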

Other topology strategies

Another method of managing replication is to use cluster replication. Cluster replication ensures constant access to data, because data on one server is duplicated on one or more cluster mates. If the primary server becomes unavailable, data can be obtained from other servers in the cluster.

Other replication topologies include:

End-to-end - Also known as a chain topology, connects two or more servers in a chain. Information travels in one direction along the chain and then travels back in the other direction. End-to-end replication is less efficient than ring replication but is useful in situations where information needs to travel in only one direction.
Ring - Similar to an end-to-end topology, but connects servers in a circle so that replication occurs within a closed loop. Ring replication can be useful in a large organization for replicating information between hub servers.
Binary tree - Connects servers in a pyramid fashion: the first (topmost) server connects to two servers below it, each of which connects to two servers below it, and so on. Information travels down the pyramid and then back up.
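The three topologies listed above differ only in which pairs of servers are connected. The following Python sketch derives those pairs from an ordered list of server names. The function names and the placeholder server names are assumptions for illustration, and the binary tree assumes servers are listed top-down, level by level.

```python
def chain_links(servers):
    """End-to-end: each server connects to the next one in the list."""
    return list(zip(servers, servers[1:]))


def ring_links(servers):
    """Ring: a chain whose last server also connects back to the first."""
    links = chain_links(servers)
    if len(servers) > 2:
        links.append((servers[-1], servers[0]))
    return links


def binary_tree_links(servers):
    """Binary tree: the server at position i connects to 2i+1 and 2i+2."""
    links = []
    for i, parent in enumerate(servers):
        for child_index in (2 * i + 1, 2 * i + 2):
            if child_index < len(servers):
                links.append((parent, servers[child_index]))
    return links


names = ["A", "B", "C", "D", "E", "F", "G"]  # placeholder server names
print(chain_links(names[:4]))
print(ring_links(names[:4]))
print(binary_tree_links(names))
```

Each pair corresponds to one connection between two servers. A ring needs exactly one more connection than a chain over the same servers.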

Using existing mail routing connections for replication

As you plan for replication, consider re-using the connections you may have already set up for Notes mail routing. If you previously created a Connection document for mail routing, you can easily enable the replication task on that document.

Unlike mail routing, which works in one direction and requires a pair of Connection documents to enable two-way routing, replication between servers works in both directions and requires only one Connection document between each pair of servers. Because the server that initiates replication takes on the larger share of the replication workload, if you decide to add replication to one of the Connection documents already used for mail routing between two servers, add the replication task to the document on the more powerful server in the pair.