"ResourceErrorMessage":"Failure condition satisfied." #120

Open
turian opened this issue Dec 4, 2022 · 1 comment

Comments

turian commented Dec 4, 2022

Ugh, #103 and #108 appear to be back with the latest images. Can you please help? I believe I have applied all the fixes we worked out in the previous threads.

spotty.yaml:

project:
  name: sss
  syncFilters:
    - exclude:
        - '*.ipynb'
        - '*.log'
        - '*.sw*'
        - '*/__pycache__/*'
        - '.ipynb_checkpoints/*'
        - '__pycache__/*'
        - .git/*
        - .idea/*
        - .mypy_cache/*
        - lightning_logs/*
        - local.py
        - wandb/*


containers:
  - projectDir: /workspace/project
    image: turian/heareval
    volumeMounts:
      - name: workspace
        mountPath: /workspace
    runtimeParameters: ['--shm-size', '50G']


instances:
  - name: spotty-sss-i1
    provider: gcp
    parameters:
      # https://cloud.google.com/compute/docs/gpus/gpu-regions-zones
      zone: europe-west4-b
      machineType: n1-standard-4
      preemptibleInstance: True
      gpu:
        type: nvidia-tesla-t4
        count: 1
      imageUri: projects/ml-images/global/images/c0-deeplearning-common-cu110-v20221107-debian-10
      volumes:
        - name: workspace
          parameters:
            size: 250
            mountDir: /workspace

gives

  Error:
  ------
  Deployment "spotty-instance-sss-spotty-sss-i1" failed.
  Error: {"ResourceType":"runtimeconfig.v1beta1.waiter","ResourceErrorCode":"412","ResourceErrorMessage":"Failure condition satisfied."}
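
For anyone triaging this: the error payload above is plain JSON, so it can be unpacked programmatically. Below is a minimal sketch, assuming the payload is copied exactly as printed. `summarize_deployment_error` is a hypothetical helper name, not part of Spotty, and it only extracts the three fields shown; it says nothing about why the waiter failed.

```python
import json

# Error payload copied verbatim from the deployment output above.
raw_error = (
    '{"ResourceType":"runtimeconfig.v1beta1.waiter",'
    '"ResourceErrorCode":"412",'
    '"ResourceErrorMessage":"Failure condition satisfied."}'
)


def summarize_deployment_error(raw: str) -> dict:
    """Hypothetical helper: parse the JSON payload and return its three fields."""
    payload = json.loads(raw)
    return {
        "resource_type": payload.get("ResourceType"),
        "code": payload.get("ResourceErrorCode"),
        "message": payload.get("ResourceErrorMessage"),
    }


summary = summarize_deployment_error(raw_error)

# Assumption for illustration: treat a ".waiter" resource type with code 412
# as the same failure signature reported in this issue.
is_waiter_failure = (
    summary["resource_type"].endswith(".waiter") and summary["code"] == "412"
)

print(summary)
print("waiter failure:", is_waiter_failure)
```

The resource type indicates that the failing resource is the deployment's waiter rather than the instance resource itself, which is why the startup script log is the next thing worth checking.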

turian commented Dec 4, 2022

I tried again using the old imageUri projects/ml-images/global/images/c0-deeplearning-common-cu113-v20211105-debian-10 but the same thing happens.

On this VM, I ran spotty sh -H, but /var/log/startup-script.log is no longer there.
