Sizing Kubernetes for a production-grade cluster
This topic offers best practices for sizing Kubernetes for a production-grade, high-availability cluster. At a minimum, such a cluster should include:
- At least three master nodes
- At least three non-infrastructure worker nodes
- At least three exclusively infrastructure worker nodes
Preconfigured Component Pack CPU and memory limits
To better understand how Kubernetes manages its resources and what Component Pack requires, see the resource management topic in the Kubernetes documentation, and review the following tables:
Application Container | Limits | Requests |
---|---|---|
analysisservice | CPU: 500m, memory 1Gi | CPU: 50m, memory 100Mi |
appregistry-client | CPU: 500m, memory 400Mi | CPU: 50m, memory 75Mi |
appregistry-service | CPU: 500m, memory 500Mi | CPU: 100m, memory 150Mi |
cnx-ingress-controller | CPU: 500m, memory 512Mi | CPU: 20m, memory 64Mi |
community-suggestions | CPU: 500m, memory 400Mi | CPU: 50m, memory 75Mi |
haproxy | CPU: 500m, memory 200Mi | CPU: 50m, memory 50Mi |
indexingservice | CPU: 500m, memory 1Gi | CPU: 200m, memory 100Mi |
itm-services | CPU: 1, memory 500Mi | CPU: 100m, memory 75Mi |
mail-service | CPU: 500m, memory 500Mi | CPU: 50m, memory 75Mi |
middleware-graphql | CPU: 1, memory 500Mi | CPU: 100m, memory 75Mi |
mw-proxy | CPU: 500m, memory 400Mi | CPU: 50m, memory 75Mi |
orient-web-client | CPU: 1, memory 1Gi | CPU: 100m, memory 75Mi |
people-idmapping | CPU: 500m, memory 400Mi | CPU: 50m, memory 75Mi |
people-migrate | CPU: 1, memory 1000Mi | CPU: 100m, memory 75Mi |
people-relation | CPU: 500m, memory 400Mi | CPU: 50m, memory 75Mi |
people-scoring | CPU: 500m, memory 1500Mi | CPU: 50m, memory 75Mi |
retrieval-service | CPU: 500m, memory 1Gi | CPU: 200m, memory 100Mi |
userprefs-service | CPU: 500m, memory 400Mi | CPU: 50m, memory 75Mi |
Infrastructure Container | Limits | Requests |
---|---|---|
es-client | CPU: 2, memory 2Gi | CPU: 100m, memory 1536Mi |
es-data | CPU: 2, memory 4Gi | CPU: 500m, memory 3Gi |
es-master | CPU: 1, memory 1Gi | CPU: 100m, memory 768Mi |
filebeat | CPU: 2, memory 2Gi | CPU: 500m, memory 512Mi |
kibana | CPU: 3, memory 4Gi | CPU: 1, memory 1Gi |
logstash | CPU: 3, memory 8Gi | CPU: 500m, memory 400Mi |
mongo | CPU: 2, memory 3096Mi | CPU: 100m, memory 100Mi |
redis-sentinel | CPU: 500m, memory 100Mi | CPU: 10m, memory 50Mi |
redis-server | CPU: 1, memory 1Gi | CPU: 50m, memory 75Mi |
sanity | CPU: 100m, memory 512Mi | CPU: 100m, memory 128Mi |
sanity-watcher | CPU: 500m, memory 100Mi | CPU: 10m, memory 50Mi |
solr | CPU: 2, memory 4Gi | CPU: 20m, memory 600Mi |
zookeeper | CPU: 500m, memory 400Mi | CPU: 10m, memory 300Mi |
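For reference, here is a minimal sketch of how the values in these tables map to the `resources` stanza of a container spec. The pod name, image, and labels are illustrative, not the actual Component Pack manifest; the numbers are the haproxy row from the first table.

```yaml
# Illustrative only: shows where the table's limits and requests live
# in a pod spec. The image and names are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: haproxy-example
spec:
  containers:
    - name: haproxy
      image: haproxy:latest   # placeholder image
      resources:
        requests:
          cpu: 50m        # guaranteed share; used by the scheduler to place the pod
          memory: 50Mi
        limits:
          cpu: 500m       # hard ceiling enforced at runtime
          memory: 200Mi
```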
Sizing the masters
The following recommendation is in line with best practices and the official Kubernetes recommendation. Note that master sizing is a function of the total number of nodes in the cluster and the volume of user requests coming to it: more active users mean more requests, and more requests mean more processing on each master.
For the optimal production scenario, we recommend at least three masters.
Maximum Number of Nodes in Cluster | Resource Requirements | AWS Equivalent |
---|---|---|
Up to 100 nodes (per the Kubernetes documentation) | 4 CPUs, 16G of RAM, 100G of disk space | m4.xlarge |
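A three-master control plane is normally fronted by a shared API endpoint (typically a load balancer) that all masters sit behind. The following is a minimal kubeadm sketch of that topology, assuming a hypothetical load-balancer address; the endpoint and the apiVersion (which depends on your kubeadm release) are illustrative.

```yaml
# Minimal kubeadm sketch for an HA control plane. The load-balancer
# address is a placeholder; use the apiVersion matching your kubeadm release.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "k8s-lb.example.com:6443"   # shared endpoint fronting all three masters
```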
Sizing the workers
Sizing workers of any type is a function of what you are going to run on them, that is, the sum of the limits of all of the containers they will host.
To run all the services shipped with Component Pack, we suggest at least three workers (each running one replica of each pod) with at least 8 cores and 32G of RAM each (the AWS equivalent would be an m4.2xlarge instance). Remember that this sizing is for the scenario where everything is running at 100% capacity, not for simply starting the services without any load.
However, if you start noticing performance issues with resource usage, try creating another three infrastructure workers; they will automatically take over the load for everything tagged to run on infrastructure workers (see the sketch after this paragraph).
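Tagging work onto infrastructure workers is done with node labels and a matching `nodeSelector`. The sketch below assumes a hypothetical `type=infrastructure` label; check your deployment for the label Component Pack actually applies.

```yaml
# Sketch of pinning a pod to infrastructure workers via a node label.
# The label key/value is hypothetical -- verify what your deployment uses.
apiVersion: v1
kind: Pod
metadata:
  name: infra-bound-example
spec:
  nodeSelector:
    type: infrastructure    # hypothetical label on infrastructure workers
  containers:
    - name: app
      image: example/app:latest   # placeholder image
```

Nodes receive the label with `kubectl label nodes <node-name> type=infrastructure`; any pod carrying the matching `nodeSelector` is then scheduled only onto those workers.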
Sizing the storage
Persistent volumes are a firm requirement for Component Pack, but even without them, nodes need disk space for caching images and maintaining normal system operation.
For normal system operation, it is best for each master to have at least 100G of dedicated disk space, and for each worker at least 150G of dedicated disk space.
For persistent volume storage (used for Elasticsearch, Customizer, MongoDB, Solr, and ZooKeeper), we suggest at least 200G of storage; a provisioning sketch follows.
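As a minimal sketch, the manifest below provisions a single 200Gi NFS-backed persistent volume; the volume name, NFS server, and export path are placeholders for your storage environment, and your deployment may instead carve the capacity into several smaller volumes, one per service.

```yaml
# Sketch of a 200Gi NFS-backed persistent volume. The server and path
# are placeholders for your storage environment.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cp-storage-example
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.com   # placeholder NFS server
    path: /pv/connections     # placeholder export path
```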