Use GPU Nodes
You can use GPU cloud hosts as cluster nodes in UK8s as follows:
Image Instructions
When you use cloud host models with high cost performance graphics cards (such as High Cost Performance 3, High Cost Performance 5, and High Cost Performance 6) as nodes in a UK8s cluster, you must use the standard image Ubuntu 20.04 High Cost Performance.
- High cost performance graphics cards are supported in the following availability zones:
  - North China 2A
  - Shanghai 2B
  - Beijing 2B
Graphics Card | Image | Driver Version | CUDA Version |
---|---|---|---|
High cost performance graphics cards (High Cost Performance 3, High Cost Performance 5, High Cost Performance 6) | Ubuntu 20.04 High Cost Performance | 535.113.01 | 12.2 |
Other graphics cards (such as T4, V100S, P40) | CentOS 7.6, Ubuntu 20.04 | 450.80.02 | 11.0 |
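To confirm that a node runs the driver version listed in the table, you can query `nvidia-smi` on the node itself. The following is a minimal sketch, not part of UK8s: the function names `gpu_driver_version` and `check_gpu_driver` and the `NVIDIA_SMI` override variable are hypothetical helpers introduced here for illustration. It assumes you are logged in to the GPU node and that `nvidia-smi` is on the `PATH`.

```shell
# Print the NVIDIA driver version reported on this node.
# NVIDIA_SMI is a hypothetical override for the nvidia-smi binary path.
gpu_driver_version() {
  # One line is printed per GPU; the driver is node-wide, so keep the first.
  "${NVIDIA_SMI:-nvidia-smi}" --query-gpu=driver_version --format=csv,noheader | head -n 1
}

# Compare the installed driver with the version expected for the image.
# Returns 0 on a match and 1 otherwise.
check_gpu_driver() {
  expected="$1"
  actual="$(gpu_driver_version)"
  if [ "$actual" = "$expected" ]; then
    echo "driver OK: $actual"
  else
    echo "driver mismatch: expected $expected, found $actual" >&2
    return 1
  fi
}
```

For example, `check_gpu_driver 535.113.01` checks a high cost performance node against the table above. The CUDA version supported by the driver is normally shown in the header of plain `nvidia-smi` output.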
Create a Cluster
When creating a cluster, in the node configuration, select “GPU Type G” as the machine type, and then select the specific GPU card type and configuration.
Note: If you choose a high cost performance graphics card, you must select the standard image Ubuntu 20.04 High Cost Performance as the node image.
Add Nodes
When adding a node to an existing cluster, select “GPU Type G” as the machine type, and then select the specific GPU card type and configuration.
Add Existing Host
Add the created GPU cloud host to an existing cluster, and choose the appropriate node image.
Instructions for Use
- By default, containers do not share GPUs. Each container can request one or more whole GPUs; requesting a fraction of a GPU is not supported.
- Master nodes of the cluster do not currently support GPU machine types.
- The standard images provided by UK8s come with the NVIDIA driver preinstalled. In addition, the `nvidia-device-plugin` component is installed in the cluster by default, so GPU resources are automatically recognized and registered once they are added to the cluster.
- To verify that a GPU node works properly:
  - Check that the node has the `nvidia.com/gpu` resource.
  - Run the following example, which requests an NVIDIA GPU through the `nvidia.com/gpu` resource type, and check that the log output is correct.

        $ cat <<EOF | kubectl apply -f -
        apiVersion: v1
        kind: Pod
        metadata:
          name: gpu-pod
        spec:
          restartPolicy: Never
          containers:
            - name: cuda-container
              image: uhub.{{domainName}}/uk8s/cuda-sample:vectoradd-cuda10.2
              resources:
                limits:
                  nvidia.com/gpu: 1 # requesting 1 GPU
          tolerations:
            - key: nvidia.com/gpu
              operator: Exists
              effect: NoSchedule
        EOF

        $ kubectl logs gpu-pod
        [Vector addition of 50000 elements]
        Copy input data from the host memory to the CUDA device
        CUDA kernel launch with 196 blocks of 256 threads
        Copy output data from the CUDA device to the host memory
        Test PASSED
        Done
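The two verification steps can also be scripted. The sketch below is one possible approach under stated assumptions, not an official UK8s tool: the function names `list_gpu_nodes`, `has_gpu_node`, and `gpu_pod_passed` are hypothetical. It assumes `kubectl` is configured for the cluster and that the sample pod is named `gpu-pod`, as in the example.

```shell
# List each node with its allocatable nvidia.com/gpu count.
# Nodes without GPUs show <none> in the GPU column.
list_gpu_nodes() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Succeed if at least one node advertises a non-zero nvidia.com/gpu resource.
has_gpu_node() {
  list_gpu_nodes |
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $2 > 0 { found = 1 } END { exit !found }'
}

# Succeed if the sample pod's log contains the "Test PASSED" line.
# The pod name defaults to gpu-pod and can be passed as the first argument.
gpu_pod_passed() {
  kubectl logs "${1:-gpu-pod}" | grep -q 'Test PASSED'
}
```

After the sample pod has finished, `has_gpu_node && gpu_pod_passed && echo "GPU node verified"` runs both checks in sequence.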