Multiple Pods Share a GPU

This solution deploys the GPU-Share plugin. Once it is deployed, multiple Pods can be scheduled onto the same GPU on a cluster node. The plugin currently supports command-line installation only. The UK8S team plans to add it to the cluster plugins in a future release so that it can be installed with one click.

Install and Use the GPU Sharing Plugin

⚠️ Check the Kubernetes version before installing. The plugin requires Kubernetes 1.17.4 or later.
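The version requirement can be checked from a shell before proceeding. The following is a minimal sketch and not part of the plugin: `version_ge` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils).

```shell
# Hypothetical helper: succeeds when version $1 >= version $2.
# A leading "v" and any suffix after "-" or "+" are ignored.
version_ge() {
  have="${1#v}"
  have="${have%%[-+]*}"
  want="${2#v}"
  [ "$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)" = "$want" ]
}

# Example with a literal version string; substitute the version your cluster reports.
if version_ge "v1.17.4" "1.17.4"; then
  echo "version OK"
else
  echo "version too old"
fi
```

The server version to pass in can be read from the output of `kubectl version`.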

1. Label the nodes that require GPU sharing

kubectl label node <nodeip> nodeShareGPU=true

2. Use kubectl to delete the original NVIDIA device plugin from the cluster

kubectl delete ds -n kube-system nvidia-device-plugin-daemonset

3. Use kubectl to install the GPU-Share plugin

kubectl apply -f https://docs.ucloud-global.com/uk8s/yaml/gpu-share/1.1.0.yaml

Test GPU Sharing

Test Conditions:

  1. The cluster has exactly one GPU cloud host, and that host has a single GPU card.
  2. The plugin has been installed in the cluster by following the three steps above.
  3. The plugin Pod is in the Running state.

Next, deploy test-gpushare1 and test-gpushare2 in turn.

# Run test-gpushare1
kubectl apply -f https://docs.ucloud-global.com/uk8s/yaml/gpu-share/test-gpushare1.yaml
# Run test-gpushare2
kubectl apply -f https://docs.ucloud-global.com/uk8s/yaml/gpu-share/test-gpushare2.yaml

Take test-gpushare1 as an example.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-gpushare1
  labels:
    app: test-gpushare1
spec:
  selector:
    matchLabels:
      app: test-gpushare1
  template:
    metadata:
      labels:
        app: test-gpushare1
    spec:
      schedulerName: gpushare-scheduler
      containers:
        - name: test-gpushare1
          image: uhub.ucloud-global.com/ucloud//gpu-player:share
          command:
            - python3
            - /app/main.py
          resources:
            limits:
              # GiB
              ucloud.cn/gpu-mem: 1

The limits section sets ucloud.cn/gpu-mem: 1, which requests 1 GiB of GPU memory; test-gpushare2 has the same setting. Even though the cluster has only one single-card GPU node, that GPU can serve both Pods at the same time. You can confirm this by listing the Pods:

kubectl get pod | grep test-gpushare
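To turn that listing into a simple pass/fail check, the Running Pods can be counted. This is a sketch only: `count_running_gpushare` is a hypothetical helper name, and it assumes the default `kubectl get pod` column order (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pod --no-headers` output on stdin
# and prints how many test-gpushare Pods have STATUS "Running".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
count_running_gpushare() {
  awk '$1 ~ /^test-gpushare/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pod --no-headers | count_running_gpushare
```

With both test Deployments scheduled onto the shared GPU, the count should be 2.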

Monitor GPU Usage

You can monitor GPU usage through the resource monitoring of the GPU node, or log in to the GPU node and run nvidia-smi.
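For a compact view of memory usage on the node, the query mode of nvidia-smi can be piped through a small formatter. This is a sketch under assumptions: `gpu_mem_summary` is a hypothetical helper name, and the input is expected as CSV rows of index, used MiB, and total MiB.

```shell
# Hypothetical helper: formats CSV rows of "index, used MiB, total MiB" read from stdin.
gpu_mem_summary() {
  awk -F', *' '{
    pct = ($3 > 0) ? 100 * $2 / $3 : 0
    printf "GPU %s: %s MiB / %s MiB (%.1f%%)\n", $1, $2, $3, pct
  }'
}

# Usage on the GPU node (requires the NVIDIA driver tools):
#   nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader,nounits | gpu_mem_summary
```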

Remove the GPU Sharing Plugin

Run the following commands on the master node. The first removes the GPU-Share plugin, and the second restores the original NVIDIA device plugin.

kubectl delete -f https://docs.ucloud-global.com/uk8s/yaml/gpu-share/1.1.0.yaml
kubectl apply -f /etc/kubernetes/yaml/nvidia-device-plugin.yaml