Container Elastic Scaling Based on Custom Metrics
Preface
HPA (Horizontal Pod Autoscaler) refers to the horizontal, automatic scaling of Kubernetes Pods; it is also an API object in Kubernetes. With this component, a Kubernetes cluster can use monitoring metrics (such as CPU utilization) to automatically increase or decrease the number of Pods behind a service. When business demand rises, HPA automatically adds Pods to keep the service stable; when demand falls, it automatically removes Pods to reduce the cluster's resource requests. Combined with Cluster Autoscaler, it can also scale the cluster itself automatically and save IT costs.
Note that by default HPA only supports scaling based on CPU and memory thresholds. However, by calling Prometheus through the custom metrics API, it can also scale elastically on more flexible, user-defined monitoring metrics. HPA cannot be used to scale controllers that do not support scaling, such as DaemonSet.
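For reference, the default CPU-based HPA is itself a short manifest. The sketch below is only an illustration: the Deployment name web-app, the replica bounds, and the 50% utilization target are assumptions, and depending on your cluster version the apiVersion may be autoscaling/v2 instead of autoscaling/v2beta2.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                 # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50    # scale out when average CPU utilization exceeds 50%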
Enable custom.metrics.k8s.io service
Before starting this step, make sure you have installed Prometheus as instructed in the previous tutorial.
Here, let's briefly introduce how HPA works. By default, it obtains a Pod's CPU and memory metrics from the local metrics.k8s.io service. CPU and memory are core metrics, and the backend of the metrics.k8s.io service is usually metrics-server, which is installed by default in UK8S.
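To confirm that the core metrics pipeline is working, you can query metrics-server directly; both commands below are read-only:

kubectl top pod -n kube-system
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .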
However, if HPA is to scale containers on metrics other than CPU and memory, we need to deploy a monitoring system such as Prometheus to collect the various metrics. The metrics collected by Prometheus cannot be used by Kubernetes directly because the data formats are incompatible, so another component, prometheus-adapter, is needed to convert Prometheus metrics into a format the Kubernetes API can recognize. In addition, we need to register a service (i.e., custom.metrics.k8s.io) in Kubernetes so that HPA can access it through /apis/.
We declare an APIService named v1beta1.custom.metrics.k8s.io and submit it.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: prometheus-adapter
    namespace: monitoring
    port: 443
  version: v1beta1
  versionPriority: 100
The prometheus-adapter service referenced in spec.service above was installed and deployed in the previous documents. After submitting this manifest, run "kubectl get apiservice | grep v1beta1.custom.metrics.k8s.io" and confirm that the availability of the service is True.
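If registration succeeded, the output looks roughly like this (the exact columns vary with the kubectl version; the age shown is illustrative):

v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   True   5m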
You can also use the following commands to see which metrics Prometheus has collected.
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | jq .
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/" | jq .
curl 127.0.0.1:8080/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests
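The API returns a MetricValueList; a query for a single metric produces items shaped like the sketch below (the pod name, timestamp, and value are illustrative):

{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "sample-app-xxxxx",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests",
      "timestamp": "2021-01-01T00:00:00Z",
      "value": "500m"
    }
  ]
}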
Modify the original prometheus-adapter configuration file
To allow HPA to use the metrics collected by Prometheus, prometheus-adapter queries Prometheus with PromQL, converts the data format, and exposes the reassembled metrics and values through its own API. HPA accesses these metrics through /apis/custom.metrics.k8s.io, which is proxied to the prometheus-adapter service.
If the adapter fetched and reassembled all of Prometheus's metrics, it would be very inefficient. The adapter is therefore configurable: users decide which Prometheus metrics to read through a ConfigMap.
For the syntax rules of the config, see config-walkthrough; they are not covered here.
Since we already installed prometheus-adapter, we only need to modify its configuration file and restart it. The original configuration file contains only the two resource metrics, CPU and memory; we just need to add the metrics that HPA needs in front of them.
apiVersion: v1
data:
  config.yaml: |
    resourceRules:
      cpu:
        containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}[1m])) by (<<.GroupBy>>)
        nodeQuery: sum(1 - rate(node_cpu_seconds_total{mode="idle"}[1m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            node:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      memory:
        containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}) by (<<.GroupBy>>)
        nodeQuery: sum(node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            instance:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      window: 1m
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
Taking the common request count as an example, we add a metric named http_requests with the resource type Pod (the name rule below strips the _total suffix from series such as http_requests_total).
apiVersion: v1
data:
  config.yaml: |
    rules:
    - seriesQuery: '{__name__=~"^http_requests_.*",kubernetes_pod_name!="",kubernetes_namespace!=""}'
      seriesFilters: []
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      name:
        matches: ^(.*)_(total)$
        as: "${1}"
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
    resourceRules:
      cpu:
        containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}[1m])) by (<<.GroupBy>>)
        nodeQuery: sum(1 - rate(node_cpu_seconds_total{mode="idle"}[1m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            node:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      memory:
        containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container_name!="POD",container_name!="",pod_name!=""}) by (<<.GroupBy>>)
        nodeQuery: sum(node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>} - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}) by (<<.GroupBy>>)
        resources:
          overrides:
            instance:
              resource: node
            namespace:
              resource: namespace
            pod_name:
              resource: pod
        containerLabel: container_name
      window: 1m
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
After modifying and submitting the ConfigMap, if you want it to take effect immediately, delete the existing prometheus-adapter Pod so that a new Pod is created with the updated configuration.
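For example, find the adapter Pod and delete it (the Pod name below is a placeholder):

kubectl -n monitoring get pod | grep prometheus-adapter
kubectl -n monitoring delete pod <prometheus-adapter-pod-name>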
Of course, these metrics alone are still somewhat limited; the community provides a standard example of rules: adapter-config standard sample.
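Once the adapter exposes http_requests, an HPA can consume it as a Pods-type metric. The manifest below is only a sketch: the Deployment name sample-app and the target of 10 requests per second per Pod are assumptions, and the apiVersion may differ with the cluster version.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app            # hypothetical Deployment exposing http_requests_total
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests     # the custom metric registered above
      target:
        type: AverageValue
        averageValue: "10"      # target of 10 requests per second per Pod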