How failover works

An IBM® Domino® server cluster's ability to redirect requests from one server to another is called failover. When a user tries to access a database on a server that is unavailable or in heavy use, Domino® directs the user to a replica of the database on another server in the cluster.

The Cluster Manager on each cluster server probes every other server in the cluster to determine its availability, and continually checks which replicas are available on each server. When a user tries to access a database that is not available, the request is redirected to a replica of the database on a different server in the cluster. Although the user ends up connected to a replica on a different server, failover is essentially transparent to the user.
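
The probing described above can be pictured with a small sketch. This is an illustration only, not Domino code: the `ServerClusterCache` class, the `probe` callback, and the 0-100 availability scale are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional


@dataclass
class ServerClusterCache:
    """Hypothetical view of peer availability kept by one Cluster Manager."""

    # Assumed probe: returns an availability score (0-100),
    # or None when the peer does not answer.
    probe: Callable[[str], Optional[int]]
    availability: Dict[str, Optional[int]] = field(default_factory=dict)

    def probe_peers(self, peers: Iterable[str]) -> None:
        """Record the latest probe result for every other cluster server."""
        for name in peers:
            self.availability[name] = self.probe(name)

    def is_available(self, name: str) -> bool:
        """A peer counts as available only if its last probe was answered."""
        return self.availability.get(name) is not None


def fake_probe(name: str) -> Optional[int]:
    # Stand-in for a real network probe; Server1 never answers.
    return {"Server2": 80, "Server3": 55}.get(name)


cache = ServerClusterCache(probe=fake_probe)
cache.probe_peers(["Server1", "Server2", "Server3"])
```

After the call to `probe_peers`, the cache records Server1 as not responding and keeps a score for the other two servers, which is the kind of information a Cluster Manager needs before it can redirect a request.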

For example, consider a cluster containing three servers, where Server 1 is currently unavailable. The Cluster Managers on Server 2 and Server 3 are aware that Server 1 is unavailable.

Failover events occur as follows:

  1. An IBM® Notes® user attempts to open a database on Server 1.
  2. Notes® detects that Server 1 is not responding.
  3. Instead of displaying a message that says the server is not responding, Notes® looks in its cluster cache to see if this server is a member of a cluster and to find the names of the other servers in the cluster. (When a Notes® client first accesses a server in a cluster, the names of all the servers in the cluster are added to the cluster cache on the client. This cache is updated every 15 minutes.)
  4. Notes® accesses the Cluster Manager on the next server listed in the cluster cache.
  5. The Cluster Manager looks in the Cluster Database Directory to find which servers in the cluster contain a replica of the desired database.
  6. The Cluster Manager looks in its server cluster cache to find the availability of each server that contains a replica. (The server cluster cache contains information about all the servers in the cluster. Cluster servers obtain this information when they send probes to the other cluster servers.)
  7. The Cluster Manager creates a list of the servers in the cluster that contain a replica of the database, sorts the list in order of availability, and sends the list to Notes®.
  8. Notes® opens the replica on the first server in the list (the most available server). If that server is no longer available, Notes® opens the replica on the next server in the list. In this example, Server 2 is the most available server.
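
The steps above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual Notes or Domino implementation: the function names, the dictionary-based Cluster Database Directory, the numeric availability scores, and the database name `sales.nsf` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set


def build_failover_list(
    database: str,
    replica_directory: Dict[str, Set[str]],
    availability: Dict[str, Optional[int]],
) -> List[str]:
    """Steps 5-7: servers holding a replica, most available first."""
    holders = replica_directory.get(database, set())
    reachable = [s for s in holders if availability.get(s) is not None]
    # Higher score first; the name breaks ties so the order is stable.
    return sorted(reachable, key=lambda s: (-availability[s], s))


def open_with_failover(
    database: str,
    target: str,
    client_cluster_cache: List[str],
    replica_directory: Dict[str, Set[str]],
    availability: Dict[str, Optional[int]],
    is_responding: Callable[[str], bool],
) -> Optional[str]:
    """Return the server whose replica gets opened, or None if none can be."""
    # Steps 1-2: try the requested server first.
    if is_responding(target):
        return target
    # Step 3: the target must be a known cluster member to fail over.
    if target not in client_cluster_cache:
        return None
    # Step 4: ask the Cluster Manager on the next responding cluster server.
    for manager in client_cluster_cache:
        if manager == target or not is_responding(manager):
            continue
        candidates = build_failover_list(database, replica_directory, availability)
        # Step 8: open the first replica in the list that still responds.
        for server in candidates:
            if is_responding(server):
                return server
        return None
    return None


# The three-server example: Server 1 is down, Server 2 is the most available.
chosen = open_with_failover(
    database="sales.nsf",
    target="Server1",
    client_cluster_cache=["Server1", "Server2", "Server3"],
    replica_directory={"sales.nsf": {"Server1", "Server2", "Server3"}},
    availability={"Server1": None, "Server2": 90, "Server3": 60},
    is_responding=lambda server: server != "Server1",
)
```

For brevity the sketch shares one availability table among all servers; in a real cluster each Cluster Manager keeps its own server cluster cache.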

When the Notes® client shuts down, it stores the contents of the cluster cache in the file cluster.ncf. Each time the client starts, it populates the cluster cache from the information in cluster.ncf.
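
The save-on-shutdown, load-on-startup cycle can be sketched as follows. The on-disk layout used here (one `cluster,server` pair per line) is an assumption chosen for the example and should not be read as the documented format of cluster.ncf; the function names and the cluster name `SalesCluster` are likewise hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def save_cluster_cache(cache: Dict[str, List[str]], path: Path) -> None:
    """On shutdown: write each cluster member as an assumed 'cluster,server' line."""
    lines = [
        f"{cluster},{server}"
        for cluster, servers in cache.items()
        for server in servers
    ]
    path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")


def load_cluster_cache(path: Path) -> Dict[str, List[str]]:
    """On startup: rebuild the in-memory cache; a missing file gives an empty cache."""
    cache: Dict[str, List[str]] = {}
    if not path.exists():
        return cache
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        cluster, server = line.split(",", 1)
        cache.setdefault(cluster.strip(), []).append(server.strip())
    return cache
```

A client that calls `save_cluster_cache` when it exits and `load_cluster_cache` when it starts already knows the cluster members before it contacts any server, which is what allows failover to work on the first request of a session.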