Overview

Understanding Elastic Scaling in K8S

Kubernetes provides scaling components along two dimensions (horizontal and vertical) and at two levels (Pod and Node) to meet the needs of different scenarios. The main ones are:

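| Type               | Pod                             | Node               |
|--------------------|---------------------------------|--------------------|
| Horizontal Scaling | HPA (Horizontal Pod Autoscaler) | Cluster Autoscaler |
| Vertical Scaling   | VPA (Vertical Pod Autoscaler)   | None               |
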
  • HPA: The Horizontal Pod Autoscaler is responsible for horizontal scaling of Pods and is the oldest of the scaling components. It currently supports the autoscaling/v1, autoscaling/v2beta1, and autoscaling/v2beta2 API versions. autoscaling/v1 supports only one scaling metric, CPU; support for custom metrics was added in autoscaling/v2beta1, and support for external metrics was added in autoscaling/v2beta2.

  • CA: The Cluster Autoscaler is responsible for horizontal scaling of Nodes. It has been GA (Generally Available) since version 1.0.0, and UK8S uses the GA version.

  • VPA: The Vertical Pod Autoscaler dynamically adjusts the resource Request values of a workload according to the Pod's resource utilization, historical data, and abnormal events. It mainly targets resource scaling for stateful services and monolithic applications. As of August 26, 2019, it is in the beta phase and is not recommended for use in production environments.
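
To make the HPA entry above more concrete, the replica calculation described in the upstream Kubernetes HPA documentation can be sketched as follows. This is a minimal illustration in Python, not the controller's actual code: the function name `desired_replicas` and its parameters are hypothetical, and the default tolerance of 0.1 is assumed to match the upstream default.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Sketch of the HPA rule: ceil(currentReplicas * current / target).

    All names here are illustrative; the tolerance default is an assumption.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # When the ratio is close enough to 1.0, skip scaling to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Round up so the workload is never left under-provisioned.
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 Pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

With autoscaling/v1 the metric would be average CPU utilization; the same ratio-based rule is assumed to apply to custom and external metrics in the later API versions.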

In addition, the cluster-proportional-autoscaler component adjusts the number of Pods horizontally according to the number of nodes in the cluster. It is currently in the GA phase and is used to dynamically size key services such as CoreDNS and Ingress based on the scale of the cluster. There is also an addon-resizer component, currently in the beta phase, which adjusts the resource Request values of a workload vertically based on the number of nodes in the cluster.
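
As a rough illustration of proportional scaling, the sketch below follows the "linear" mode described in the cluster-proportional-autoscaler project documentation, which also takes CPU cores into account alongside node count. The function name `linear_replicas` and its Python parameter names are hypothetical; they loosely mirror the project's `nodesPerReplica`, `coresPerReplica`, `min`, and `max` settings, and the example values are assumptions rather than recommended settings.

```python
import math


def linear_replicas(nodes: int,
                    cores: int,
                    nodes_per_replica: int,
                    cores_per_replica: int,
                    min_replicas: int = 1,
                    max_replicas: int = 100) -> int:
    """Sketch of linear proportional scaling (illustrative names and defaults)."""
    if nodes_per_replica <= 0 or cores_per_replica <= 0:
        raise ValueError("per-replica ratios must be positive")

    by_nodes = math.ceil(nodes / nodes_per_replica)
    by_cores = math.ceil(cores / cores_per_replica)

    # Take the larger of the two estimates, then clamp to the allowed range.
    replicas = max(by_nodes, by_cores)
    return max(min_replicas, min(replicas, max_replicas))


if __name__ == "__main__":
    # A 10-node, 40-core cluster with one replica per 4 nodes or 16 cores.
    print(linear_replicas(10, 40, nodes_per_replica=4, cores_per_replica=16))
```

For a service such as CoreDNS, this means the replica count grows in steps as nodes are added, rather than reacting to the service's own load.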

Next, we introduce the two most commonly used scaling components: HPA and CA.