Use GPU Nodes

You can use GPU cloud hosts as cluster nodes in UK8s as follows:

Image Instructions

When using cloud host models with high cost performance graphics cards (such as High Cost Performance 3, High Cost Performance 5, or High Cost Performance 6) as nodes in a UK8s cluster, you must use the standard image Ubuntu 20.04 High Cost Performance.

  • High cost performance graphics cards are supported in the following availability zones:
    • North China 2A
    • Shanghai 2B
    • Beijing 2B
Graphics Card | Image | Driver Version | CUDA Version
High cost performance graphics cards (High Cost Performance 3, High Cost Performance 5, High Cost Performance 6) | Ubuntu 20.04 High Cost Performance | 535.113.01 | 12.2
Non-high cost performance graphics cards (such as T4, V100S, P40, etc.) | CentOS 7.6, Ubuntu 20.04 | 450.80.02 | 11.0
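
To confirm that a node runs the driver version listed for its image, you can compare the version reported by nvidia-smi with the expected value. The following is only a sketch, assuming nvidia-smi is available where you run it (for example, on the GPU node itself); the helper names driver_version and driver_matches are hypothetical and used only for illustration.

```shell
# Print the driver version reported for the first GPU on this node.
# Assumes nvidia-smi is on PATH (for example, when run on the GPU node).
driver_version() {
  nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
}

# Succeed only when the actual version equals the expected one.
# $1 = expected version (from the image's documentation), $2 = actual version.
driver_matches() {
  expected="$1"
  actual="$2"
  [ "$actual" = "$expected" ]
}
```

For example, on a high cost performance node: driver_matches 535.113.01 "$(driver_version)" && echo "driver OK".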

Create a Cluster

When creating a cluster, in the node configuration, select “GPU Type G” as the machine type, and then select the specific GPU card type and configuration.

Note: If you choose a high cost performance graphics card, you must select the standard image Ubuntu 20.04 High Cost Performance as the node image.

Add Nodes

When adding a node to an existing cluster, select “GPU Type G” as the machine type, and then select the specific GPU card type and configuration.

Add Existing Host

When adding an already created GPU cloud host to an existing cluster, choose the node image that matches its graphics card.

Instructions for Use

  1. By default, containers do not share GPUs. Each container can request one or more whole GPUs; a fraction of a GPU cannot be requested.
  2. The Master nodes of a cluster do not currently support GPU machine types.
  3. The standard images provided by UK8s come with the NVIDIA driver preinstalled. In addition, the nvidia-device-plugin component is installed in the cluster by default, so GPU resources are recognized and registered automatically once GPU nodes are added to the cluster.
  4. To verify that a GPU node works correctly:
    1. Check that the node reports the nvidia.com/gpu resource.

    2. Run the following example, which requests one NVIDIA GPU through the nvidia.com/gpu resource type, and check that the log output contains "Test PASSED".

    $ cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-pod
    spec:
      restartPolicy: Never
      containers:
        - name: cuda-container
          image: uhub.{{domainName}}/uk8s/cuda-sample:vectoradd-cuda10.2
          resources:
            limits:
              nvidia.com/gpu: 1 # requesting 1 GPU
      tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
    EOF
    $ kubectl logs gpu-pod
    [Vector addition of 50000 elements]
    Copy input data from the host memory to the CUDA device
    CUDA kernel launch with 196 blocks of 256 threads
    Copy output data from the CUDA device to the host memory
    Test PASSED
    Done
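
The first verification step (checking that a node reports the nvidia.com/gpu resource) can be done from the command line. The following is a minimal sketch, assuming kubectl is already configured for the cluster; the helper names list_gpu_nodes and filter_gpu_rows are hypothetical and only serve this illustration.

```shell
# List every node with its allocatable GPU count.
# Nodes without the nvidia.com/gpu resource show <none> in the GPU column.
list_gpu_nodes() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Read the table above on stdin, skip the header row, and keep only
# rows whose GPU column is a positive integer.
filter_gpu_rows() {
  awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $2 > 0 { print $1, $2 }'
}
```

Running list_gpu_nodes | filter_gpu_rows prints one line per GPU node with its name and GPU count. An empty result suggests that no node has registered GPU resources yet.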