Suggestion: use Kubernetes with GKE Autopilot instead of VMs to run book examples on a cloud GPU #17

Open
AlexBulankou opened this issue Jul 15, 2023 · 0 comments

AlexBulankou commented Jul 15, 2023

This repo provides instructions for setting up a GCP cloud VM instance with a GPU to run the examples.
I would like to recommend taking this further and using GKE Autopilot for GPU workloads instead of VMs.
Some benefits are:

  • GKE Autopilot's pay-per-use model keeps costs down: applying workloads with kubectl apply is simple, and deleting the pod when it is idle is effortless.
  • Leverage service-based load balancing to expose Jupyter Lab, eliminating the need for port forwarding.
  • Maintenance/upgrades are managed seamlessly by GKE Autopilot, freeing users from routine system upkeep.
  • Adopting Kubernetes, a scalable and industry-standard platform, gives readers practical experience that a Docker Compose on a VM setup does not.

This is how I deployed the examples to GKE Autopilot:

  1. Build and push the Docker image:
IMAGE=<your_image> # you can also skip this step and use bulankou/gdl2:20230715 that I built
docker build -f ./docker/Dockerfile.gpu -t $IMAGE .
docker push $IMAGE
  2. Create a GKE Autopilot cluster with all default settings.
  3. Apply the following K8s manifest (kubectl apply -f <yaml>). Make sure to update <IMAGE> below. Also note the cloud.google.com/gke-accelerator: "nvidia-tesla-t4" node selector and the autopilot.gke.io/host-port-assignment annotation, which ensure we get the right node type and enable host ports on Autopilot.
apiVersion: v1
kind: Pod
metadata:
  name: app
  annotations:
    autopilot.gke.io/host-port-assignment: '{"min":6006,"max":8888}'
  labels:
    service: app
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: "nvidia-tesla-t4"
  containers:
    - command: ["/bin/sh", "-c"]
      args: ["jupyter lab --ip 0.0.0.0 --port=8888 --no-browser --allow-root"]
      image: <IMAGE>
      name: app
      ports:
        - containerPort: 8888
          hostPort: 8888
        - containerPort: 6006
          hostPort: 6006
      resources:
        limits:
          nvidia.com/gpu: 1
        requests:
          cpu: "18"
          memory: "18Gi"
      tty: true
  restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: app
spec:
  type: LoadBalancer
  ports:
    - name: "8888"
      port: 8888
      targetPort: 8888
    - name: "6006"
      port: 6006
      targetPort: 6006
  selector:
    service: app
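For reference, steps 2 and 3 can be sketched end to end as follows. The cluster name and region here are placeholders I chose for illustration; adjust them to your project, and note that the manifest above is assumed to be saved as app.yaml:

```shell
# Step 2: create an Autopilot cluster with default settings, then fetch credentials
gcloud container clusters create-auto gdl2-cluster --region=us-central1
gcloud container clusters get-credentials gdl2-cluster --region=us-central1

# Step 3: apply the manifest above (saved locally as app.yaml)
kubectl apply -f app.yaml

# Wait for the Service to get an external IP (EXTERNAL-IP column)
kubectl get service app --watch

# With default Jupyter settings, the access token appears in the pod logs
kubectl logs app
```

Once the LoadBalancer has an external IP, Jupyter Lab is reachable at http://EXTERNAL-IP:8888 and TensorBoard at http://EXTERNAL-IP:6006, with no port forwarding needed.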