
Use GPU Nodes

You can use GPU cloud hosts as cluster nodes in UK8s as follows:

Image Instructions

When using cloud host models equipped with high cost-performance graphics cards (such as High Cost-Performance Graphics Card 3, 5, or 6) as nodes in a UK8s cluster, you must use the standard image “Ubuntu 20.04 High Cost-Performance”.

  • Availability zones that support high cost-performance graphics cards:
    • North China 2A
    • Shanghai 2B
    • Beijing 2B
| Graphics Card | Image | Driver Version | CUDA Version |
| --- | --- | --- | --- |
| High cost-performance graphics cards (High Cost-Performance 3, 5, 6) | Ubuntu 20.04 High Cost-Performance | 535.113.01 | 12.2 |
| Non-high cost-performance graphics cards (e.g. T4, V100S, P40) | CentOS 7.6, Ubuntu 20.04 | 450.80.02 | 11.0 |
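
You can confirm the installed driver and CUDA version on a GPU node by running nvidia-smi on it; the header of its output reports both versions:

    $ nvidia-smi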

Create a Cluster

When creating a cluster, in the Node configuration, select “GPU Type G” as the machine type, then choose the specific GPU card type and configuration.

Note: If you choose a high cost-performance graphics card, you must select the standard image “Ubuntu 20.04 High Cost-Performance” as the node image.

Add Nodes

When adding a node to an existing cluster, select “GPU Type G” as the machine type, then choose the specific GPU card type and configuration.

Add Existing Host

To add an already-created GPU cloud host to an existing cluster, choose the appropriate node image.

Instructions for Use

  1. By default, containers do not share GPUs. Each container can request one or more whole GPUs; fractions of a GPU cannot be requested.
  2. Master nodes of the cluster currently do not support GPU machine types.
  3. The standard images provided by UK8s come with the NVIDIA driver pre-installed. In addition, the nvidia-device-plugin component is installed in the cluster by default, so GPU resources are automatically recognized and registered once GPU nodes join the cluster.
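
    For example, you can confirm that the device plugin pods are running (in UK8s they typically run in the kube-system namespace; adjust the namespace if your cluster places them elsewhere):

    $ kubectl get pods -n kube-system | grep nvidia-device-plugin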
  4. To verify that a GPU node is working correctly:
    1. Check whether the node exposes the nvidia.com/gpu resource, as in the example below.
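
    You can check a node's resources with kubectl (replace <gpu-node-name> with an actual node name); both Capacity and Allocatable should list nvidia.com/gpu:

    $ kubectl describe node <gpu-node-name> | grep nvidia.com/gpu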

    2. Run the following example, which requests an NVIDIA GPU through the nvidia.com/gpu resource type, and check that the log output is correct.

    $ cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-pod
    spec:
      restartPolicy: Never
      containers:
        - name: cuda-container
          image: uhub.ucloud-global.com/uk8s/cuda-sample:vectoradd-cuda10.2
          resources:
            limits:
              nvidia.com/gpu: 1 # requesting 1 GPU
      tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
    EOF
    $ kubectl logs gpu-pod
    [Vector addition of 50000 elements]
    Copy input data from the host memory to the CUDA device
    CUDA kernel launch with 196 blocks of 256 threads
    Copy output data from the CUDA device to the host memory
    Test PASSED
    Done
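
    After verifying the output, you can delete the test pod:

    $ kubectl delete pod gpu-pod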