Container Elastic Scaling Based on Custom Metrics
Preface
HPA (Horizontal Pod Autoscaler) refers to the horizontal, automatic scaling of Kubernetes Pods; it is also an API object in Kubernetes. With this component, a Kubernetes cluster can use monitoring metrics (such as CPU utilization) to automatically increase or decrease the number of Pods behind a service. When business demand rises, HPA automatically adds Pods to keep the service stable; when demand falls, it automatically removes Pods to reduce the cluster's resource requests. Combined with Cluster Autoscaler, it can also scale the cluster itself automatically and save IT costs.
Note that by default HPA only supports scaling based on CPU and memory thresholds. However, by calling Prometheus through the custom metrics API, it can also scale elastically on more flexible, user-defined monitoring metrics. HPA cannot be used to scale controllers that do not support scaling, such as DaemonSet.
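For reference, the default CPU-based HPA is itself a short manifest. The sketch below is only an illustration: the Deployment name web-app, the replica bounds, and the 50% utilization target are assumptions, and depending on your cluster version the apiVersion may be autoscaling/v2 instead of autoscaling/v2beta2.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                 # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50    # scale out when average CPU utilization exceeds 50%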
Enable custom.metrics.k8s.io service
Before starting this step, make sure you have installed Prometheus as instructed in the previous tutorial.
Here, let's briefly introduce how HPA works. By default, it obtains a Pod's CPU and memory metrics from the local metrics.k8s.io service. CPU and memory are core metrics, and the backend of the metrics.k8s.io service is usually metrics-server, which is installed by default in UK8S.
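To confirm that the core metrics pipeline is working, you can query metrics-server directly; both commands below are read-only:

kubectl top pod -n kube-system
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .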
However, if HPA is to scale containers on metrics other than CPU and memory, we need to deploy a monitoring system such as Prometheus to collect the various metrics. The metrics collected by Prometheus cannot be used by Kubernetes directly because the data formats are incompatible, so another component, prometheus-adapter, is needed to convert Prometheus metrics into a format the Kubernetes API can recognize. In addition, we need to register a service (i.e., custom.metrics.k8s.io) in Kubernetes so that HPA can access it through /apis/.
We declare an APIService named v1beta1.custom.metrics.k8s.io and submit it.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: prometheus-adapter
    namespace: monitoring
    port: 443
  version: v1beta1
  versionPriority: 100
The prometheus-adapter service referenced in spec.service above was installed and deployed in the previous documents. After submitting this manifest, run "kubectl get apiservice | grep v1beta1.custom.metrics.k8s.io" and confirm that the availability of the service is True.
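If registration succeeded, the output looks roughly like this (the exact columns vary with the kubectl version; the age shown is illustrative):

v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   True   5m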
You can also use the following commands to see which metrics Prometheus has collected.
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | jq .
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/" | jq .
curl 127.0.0.1:8080/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests
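The API returns a MetricValueList; a query for a single metric produces items shaped like the sketch below (the pod name, timestamp, and value are illustrative):

{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "sample-app-xxxxx",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests",
      "timestamp": "2021-01-01T00:00:00Z",
      "value": "500m"
    }
  ]
}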
Modify the original prometheus-adapter configuration file
To allow HPA to use the metrics collected by Prometheus, prometheus-adapter queries Prometheus with PromQL, converts the data format, and exposes the reassembled metrics and values through its own API. HPA accesses these metrics through /apis/custom.metrics.k8s.io, which is proxied to the prometheus-adapter service.
If the adapter fetched and reassembled all of Prometheus's metrics, it would be very inefficient. The adapter is therefore configurable: users decide which Prometheus metrics to read through a ConfigMap.
For the syntax rules of the config, see config-walkthrough; they are not covered here.
Since we already installed prometheus-adapter, we only need to modify its configuration file and restart it. The original configuration file contains only the two resource metrics, CPU and memory; we just need to add the metrics that HPA needs in front of them.
apiVersion: v1
data:
  config.yaml: |
    resourceRules:
      cpu:
        containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}[1m])) by (<<.GroupBy>>)
        nodeQuery: sum(1 - rate(node_cpu_seconds_total{mode="idle"}[1m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            node:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      memory:
        containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}) by (<<.GroupBy>>)
        nodeQuery: sum(node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            instance:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      window: 1m
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
Taking the common request count as an example, we add a metric named http_requests with the resource type Pod (the name rule below strips the _total suffix from series such as http_requests_total).
apiVersion: v1
data:
  config.yaml: |
    rules:
    - seriesQuery: '{__name__=~"^http_requests_.*",kubernetes_pod_name!="",kubernetes_namespace!=""}'
      seriesFilters: []
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      name:
        matches: ^(.*)_(total)$
        as: "${1}"
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
    resourceRules:
      cpu:
        containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}[1m])) by (<<.GroupBy>>)
        nodeQuery: sum(1 - rate(node_cpu_seconds_total{mode="idle"}[1m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            node:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      memory:
        containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}) by (<<.GroupBy>>)
        nodeQuery: sum(node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            instance:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      window: 1m
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
After modifying and submitting the ConfigMap, if you want it to take effect immediately, delete the existing prometheus-adapter Pod so that a new Pod is created with the updated configuration.
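For example, find the adapter Pod and delete it (the Pod name below is a placeholder):

kubectl -n monitoring get pod | grep prometheus-adapter
kubectl -n monitoring delete pod <prometheus-adapter-pod-name>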
Of course, these metrics alone are still somewhat limited; the community provides a standard example of rules: adapter-config standard sample.
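Once the adapter exposes http_requests, an HPA can consume it as a Pods-type metric. The manifest below is only a sketch: the Deployment name sample-app and the target of 10 requests per second per Pod are assumptions, and the apiVersion may differ with the cluster version.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app            # hypothetical Deployment exposing http_requests_total
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests     # the custom metric registered above
      target:
        type: AverageValue
        averageValue: "10"      # target of 10 requests per second per Pod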