GridGuard Clustering

Provides a high-level overview of the GridGuard server cluster configuration.

Overview

GridGuard clustering allows multiple GridGuard servers to be configured so that they appear as a single server to external entities. Any number of servers can be joined together in a cluster. Clustering provides two critical benefits:

  1. Load Balancing & Scalability - By clustering multiple servers, the system load can be spread across multiple servers. This allows for a higher throughput in terms of the transaction processing rate and allows for the system to scale horizontally based on the expected load.
  2. High Availability - Clustering eliminates single points of failure as far as the GridGuard server is concerned. If a server in a cluster goes down, the other servers in the cluster continue to function normally and provide uninterrupted service.

Typically, clusters are no larger than 2 or 3 servers. The marginal benefit of adding a server goes down with each server that is added to the cluster, because of the additional overhead incurred to keep the data synchronized across all the nodes in the cluster. It is more efficient to increase the CPU and memory specs of the GridGuard virtual machine than it is to add new servers to a cluster.
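
The diminishing return can be illustrated with a toy model. The sketch below assumes, purely for illustration, that every node has the same capacity and that each additional peer costs every node a fixed percentage of that capacity for synchronization. The function name, the linear overhead model, and the numbers are hypothetical and are not measurements of GridGuard.

```python
def cluster_throughput(nodes: int, node_capacity: int, sync_overhead_pct: int) -> int:
    """Toy estimate of total transactions/sec for a cluster of `nodes` peers.

    Assumption (not from GridGuard): each node loses `sync_overhead_pct`
    percent of its capacity for every other peer it must stay in sync with.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    usable_pct = max(0, 100 - sync_overhead_pct * (nodes - 1))
    return nodes * node_capacity * usable_pct // 100


# Hypothetical numbers: 100 transactions/sec per node, 15% overhead per peer.
previous = 0
for n in range(1, 5):
    total = cluster_throughput(n, node_capacity=100, sync_overhead_pct=15)
    print(f"{n} node(s): {total} tps (gain {total - previous})")
    previous = total
```

Under these assumed numbers each added node contributes less than the one before it, which is the pattern described above. Real overhead depends on the deployment and should be measured.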

GridGuard Cluster

Some important points to note about a GridGuard cluster:

  1. All nodes are considered peers; there is no single master node.
  2. Only the LDAP database (PIN, Corner, GridPic, GridSoftToken & GridKey settings, and optionally the nonce store) is replicated across the nodes. The server configuration is never replicated. It is up to the Administrator to ensure that each node is configured appropriately. If changes are made to the configuration of any node, the Administrator should also take care to ensure that necessary changes, if any, are also applied to all other nodes in the cluster.
  3. Nodes can optionally be configured to replicate the nonce store. This is necessary only when load balancing is required. For an HA/DR scenario, nonce replication is not required.
    When nonce replication is enabled, it is critical that the network latency between the nodes in the cluster be extremely small. Considerably more data is synchronized in nonce-replication mode, and replication must proceed with little to no latency for the cluster to function without errors.
  4. After a node goes down (for any reason) and comes back up, it re-synchronizes itself with the cluster. No manual intervention is necessary.
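
Because the server configuration is never replicated, it can help to compare the settings of all nodes after a change. The following is a minimal sketch of such a check, assuming the Administrator has already exported each node's settings into a plain dictionary. The function name, node names, and setting keys are hypothetical and do not correspond to an actual GridGuard API or configuration file.

```python
from typing import Dict


def find_config_drift(node_configs: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return the settings whose values are not identical on every node.

    The result maps each drifting setting name to {node name: value},
    using "<missing>" where a node does not define the setting at all.
    """
    all_keys = set()
    for config in node_configs.values():
        all_keys.update(config)

    drift = {}
    for key in sorted(all_keys):
        values = {node: config.get(key, "<missing>")
                  for node, config in node_configs.items()}
        if len(set(values.values())) > 1:
            drift[key] = values
    return drift


# Hypothetical exported settings for two nodes in the same cluster.
configs = {
    "node-a": {"session_timeout": "300", "ldap_port": "636"},
    "node-b": {"session_timeout": "600", "ldap_port": "636"},
}
for setting, values in find_config_drift(configs).items():
    print(f"{setting} differs: {values}")
```

Only settings that are expected to be identical on every node should be fed into such a check; node-specific values such as hostnames would need to be excluded first.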

Network Architecture

The basic network architecture for a clustered setup includes the following components:

  1. First GridGuard server
  2. A secondary GridGuard server set up in a cluster with the first
  3. An HTTPS Reverse Proxy that is used to distribute the load between the two GridGuard servers. Load distribution can be based on any algorithm supported by the Reverse Proxy server, including round-robin, ratio-based, or geolocation-based.
  4. A security device or appliance that receives the authentication requests
  5. A Virtual IP that is used by the security appliance or service to perform LDAP binds to verify credentials.
  6. Clients accessing services from the extranet

Note: The HTTPS Reverse Proxy and LDAP Virtual IP servers are not provided by SyferLock.
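
To make the round-robin option concrete, here is a minimal sketch of how a reverse proxy might rotate requests between the two GridGuard servers and skip a node that is down. It is an illustration of the general technique only, not the behavior of any particular proxy product; the class name, method names, and host names are hypothetical.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Rotate through backend nodes in order, skipping nodes marked down."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self._nodes: List[str] = list(nodes)
        if not self._nodes:
            raise ValueError("at least one backend node is required")
        self._down: Set[str] = set()
        self._next = 0  # index of the node to try first on the next pick

    def mark_down(self, node: str) -> None:
        self._down.add(node)

    def mark_up(self, node: str) -> None:
        self._down.discard(node)

    def pick(self) -> str:
        """Return the next healthy node, trying each node at most once."""
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next]
            self._next = (self._next + 1) % len(self._nodes)
            if node not in self._down:
                return node
        raise RuntimeError("no healthy backend nodes available")


# Hypothetical host names for the two clustered servers.
balancer = RoundRobinBalancer(["gridguard-1", "gridguard-2"])
print([balancer.pick() for _ in range(4)])  # alternates between the two nodes

balancer.mark_down("gridguard-1")           # simulate a node failure
print([balancer.pick() for _ in range(2)])  # all traffic goes to gridguard-2
```

In a real deployment the proxy's own health checks would decide when a node is marked down or up. As noted above, nonce replication must be enabled when requests are balanced across nodes in this way.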