Set up the GPU share scheduler extender plugin on Kubernetes 1.26.3.

Worker Node

0. Prepare GPU Node (the NVIDIA driver and NVIDIA container runtime must already be installed on the worker)

Master Node

Official GitHub link: https://github.com/AliyunContainerService/gpushare-scheduler-extender

1. Deploy the GPU share scheduler extender on the control plane

kubectl create -f https://raw.githubusercontent.com/AliyunContainerService/gpushare-scheduler-extender/master/config/gpushare-schd-extender.yaml
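Optionally verify that the extender's Deployment and Service came up (the names assume the defaults from the upstream manifest):

kubectl get deploy,svc -n kube-system | grep gpushare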

2. Modify scheduler configuration

The goal is to include the scheduler policy/config file in the scheduler configuration (/etc/kubernetes/manifests/kube-scheduler.yaml); on Kubernetes v1.23+ this file is scheduler-policy-config.yaml rather than scheduler-policy-config.json.

Notice: if your default Kubernetes scheduler is deployed as a static pod, don't edit the YAML file inside /etc/kubernetes/manifests/ in place. Edit a copy of the file outside the /etc/kubernetes/manifests/ directory, then copy the edited file back into /etc/kubernetes/manifests/; the kubelet will automatically recreate the default scheduler's static pod from the updated file.
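A minimal sketch of that edit flow (paths assume a standard kubeadm layout):

cp /etc/kubernetes/manifests/kube-scheduler.yaml /tmp/kube-scheduler.yaml
vi /tmp/kube-scheduler.yaml    # apply the changes described in 2.1.1-2.1.3 below
cp /tmp/kube-scheduler.yaml /etc/kubernetes/manifests/kube-scheduler.yaml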

2.1 Kubernetes v1.23+

Starting with Kubernetes v1.23, scheduling policies are no longer supported; scheduler configurations should be used instead. That means scheduler-policy-config.yaml needs to be included in the scheduler config (/etc/kubernetes/manifests/kube-scheduler.yaml).

A sample of the final modified kube-scheduler.yaml is shown after step 2.1.3 below.

2.1.1 Copy scheduler config file into /etc/kubernetes
cd /etc/kubernetes
curl -O https://raw.githubusercontent.com/AliyunContainerService/gpushare-scheduler-extender/master/config/scheduler-policy-config.yaml

Notes on scheduler-policy-config.yaml (see the sketch below):

  • If the Service address in "urlPrefix" cannot be reached, you can set it to the address and port of the node running the schd-extender; the default port is 12345, not 32677 (the service port).
  • nodeCacheCapable must be true.
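For reference, the key fields of the downloaded file look roughly like this (the apiVersion and the urlPrefix host are illustrative; check the file you actually downloaded):

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
extenders:
- urlPrefix: "http://<schd-extender-node-ip>:12345/gpushare-scheduler"
  filterVerb: filter
  bindVerb: bind
  enableHTTPS: false
  nodeCacheCapable: true   # must be true
  managedResources:
  - name: aliyun.com/gpu-mem
    ignoredByScheduler: false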
2.1.2 Add the config file parameter to the scheduler arguments
- --config=/etc/kubernetes/scheduler-policy-config.yaml
2.1.3 Add the volume mount and volume to the Pod spec
# under the kube-scheduler container's volumeMounts:
- mountPath: /etc/kubernetes/scheduler-policy-config.yaml
  name: scheduler-policy-config
  readOnly: true
# under the pod-level volumes:
- hostPath:
    path: /etc/kubernetes/scheduler-policy-config.yaml
    type: FileOrCreate
  name: scheduler-policy-config
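Putting 2.1.1-2.1.3 together, the final /etc/kubernetes/manifests/kube-scheduler.yaml looks roughly like the trimmed sketch below. The image tag and the kubeadm-generated flags are illustrative; keep whatever your cluster already has and only add the --config argument, the volumeMount, and the volume:

apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - name: kube-scheduler
    image: registry.k8s.io/kube-scheduler:v1.26.3
    command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --config=/etc/kubernetes/scheduler-policy-config.yaml
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /etc/kubernetes/scheduler-policy-config.yaml
      name: scheduler-policy-config
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /etc/kubernetes/scheduler-policy-config.yaml
      type: FileOrCreate
    name: scheduler-policy-config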

3. Deploy Device Plugin

The gpushare device plugin replaces the standard NVIDIA device plugin, so delete the existing DaemonSet first (this errors harmlessly if it is not installed):

kubectl delete ds -n kube-system nvidia-device-plugin-daemonset
kubectl create -f https://raw.githubusercontent.com/AliyunContainerService/gpushare-device-plugin/master/device-plugin-rbac.yaml

kubectl create -f https://raw.githubusercontent.com/AliyunContainerService/gpushare-device-plugin/master/device-plugin-ds.yaml
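The DaemonSet is created now, but its pods will only be scheduled once nodes carry the gpushare label added in the next step. You can check it with:

kubectl get ds -n kube-system | grep gpushare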

4. Add the gpushare node label to the nodes requiring GPU sharing

You need to add the label "gpushare=true" to every node where you want the device plugin installed, because the device plugin is a DaemonSet that only targets labeled nodes.

kubectl label node <target_node> gpushare=true

For example:

kubectl label node mynode gpushare=true
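To confirm the label and, once the device plugin pods are up, where they landed (the pod name prefix assumes the upstream DaemonSet manifest):

kubectl get nodes -L gpushare
kubectl get pods -n kube-system -o wide | grep gpushare-device-plugin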

5. Install Kubectl extension

The device plugin exposes each GPU node's memory capacity and keeps track of GPU memory allocation; the kubectl extension lets you inspect both from the command line.

The kubectl extension is only available on Linux for now, so you will have to install kubectl and the extension on a Linux machine:

curl -o ~/kubectl-inspect-gpushare http://124.221.159.211/manifests/nvidia-gpu/kubectl-inspect-gpushare
sudo cp ~/kubectl-inspect-gpushare /usr/local/bin/
sudo chmod 755 /usr/local/bin/kubectl-inspect-gpushare
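kubectl discovers the extension simply because a binary named kubectl-inspect-gpushare is on the PATH; you can confirm it is picked up with:

kubectl plugin list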

Then run the inspector to show the GPU memory:

rgsoft@k8s-master-2305:~/Downloads/gpu_extender$ kubectl inspect gpushare
NAME            IPADDRESS  GPU0(Allocated/Total)  GPU Memory(GiB)
rgsoft-ms-7b78  10.8.0.6   0/12                   0/12
------------------------------------------------------
Allocated/Total GPU Memory In Cluster:
0/12 (0%)

After that, run a pod that specifies the gpushare scheduler.

Note: if the pod cannot be scheduled, try removing schedulerName: gpushare-scheduler.

apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-share-sample
spec:
  parallelism: 1
  template:
    metadata:
      labels:
        app: gpu-share-sample
    spec:
      schedulerName: gpushare-scheduler  # important!
      containers:
      - name: gpu-share-sample
        image: registry.cn-hangzhou.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
        command:
        - python
        - tensorflow-sample-code/tfjob/docker/mnist/main.py
        - --max_steps=100000
        - --data_dir=tensorflow-sample-code/data
        resources:
          limits:
            aliyun.com/gpu-mem: 3  # GPU memory request; GiB in this setup (see the inspect output above)
        workingDir: /root
      restartPolicy: Never
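Save the manifest (the file name gpu-share-sample.yaml below is just an example) and submit it, then watch for the pod to start:

kubectl create -f gpu-share-sample.yaml
kubectl get pods -l app=gpu-share-sample -w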
Once the pod is running, check the job logs to verify that TensorFlow created a GPU device:

kubectl logs gpu-share-sample-vrpsj --tail 1
2023-03-23 09:51:02.301985: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)