Install requirements updated to Apr 2023, CLIP model compatibility fix for Ampere GPUs #29

Open
wants to merge 6 commits into
base: master

@alopezrivera alopezrivera commented Apr 14, 2023

Hi! First of all, thank you so much for making your research open! CLIPort is a really cool project :)

This is a pull request which fixes installation issues as of April 2023, and adds compatibility with PyTorch 2.0.0 (possibly later versions). This fix allows us to train, validate and test using an Ampere architecture A100 GPU (tested as well on A6000s).

Requirement updates to solve installation issues as of April 2023

  • numpy 1.20.*
    To solve [numpy] ModuleNotFoundError: No module named
    'numpy.core._multiarray_umath'

  • protobuf 3.20.*
    To solve [wandb] TypeError: Descriptors cannot not be created
    directly.

  • Pillow 9.4.0
    To solve [PIL] ImportError: cannot import name '_imaging' from
    'PIL'

  • packaging 21.3
    To solve [transformers] packaging.version.InvalidVersion: Invalid
    version: '0.10.1,<0.11'
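
To confirm that an environment matches these pins before training, a small check along the following lines may help. This is a minimal sketch, not part of the pull request: the pins are copied from the list above, the helper names `matches_pin` and `find_mismatches` are hypothetical, and a trailing `.*` is assumed to mean "any patch release".

```python
from importlib import metadata

# Pins copied from the list above; a trailing ".*" is assumed to mean any patch release.
PINS = {
    "numpy": "1.20.*",
    "protobuf": "3.20.*",
    "Pillow": "9.4.0",
    "packaging": "21.3",
}


def matches_pin(installed: str, pin: str) -> bool:
    """Return True if an installed version string satisfies a pin."""
    if pin.endswith(".*"):
        prefix = pin[:-2]  # "1.20.*" -> "1.20"
        return installed == prefix or installed.startswith(prefix + ".")
    return installed == pin


def find_mismatches(pins=PINS, get_version=metadata.version):
    """Map each package that violates its pin to the version found (None if missing)."""
    mismatches = {}
    for name, pin in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or not matches_pin(installed, pin):
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches().items():
        print(f"{name}: expected {PINS[name]}, found {found}")
```

Running the script in the target environment prints one line per package that is missing or does not match its pin, and prints nothing when all four pins are satisfied.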

CLIP model update for compatibility with NVIDIA Ampere and later (>sm_75) GPU architectures

load_clip

Training CLIPort with PyTorch > 1.7 (required to support GPU architectures >sm_75) raises the following error:

TypeError: 'torch._C.Node' object is not subscriptable

The solution is to set jit=False in the signature of load_clip. This change follows from discussions (issues 79 and 49) in the OpenAI CLIP issue tracker and the solution implemented by the authors.
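
The signature change can be pictured with the stand-in below. It is only a sketch under stated assumptions: this `load_clip` is a hypothetical placeholder, not CLIPort's actual function, the model name "RN50" is just an example, and the comments describe the two routes as they appear to work in the upstream CLIP loader.

```python
import inspect


def load_clip(name, device="cpu", jit=False):
    """Hypothetical stand-in for load_clip; only the jit default matters here."""
    # jit=True would take the TorchScript route, where the
    # 'torch._C.Node' error is reported with newer PyTorch versions.
    # jit=False takes the route that rebuilds the model from a state_dict.
    route = "torchscript" if jit else "state_dict"
    return {"name": name, "device": device, "route": route}


# With the new default, callers that never pass jit get the non-JIT route.
default_jit = inspect.signature(load_clip).parameters["jit"].default
result = load_clip("RN50")
print(default_jit, result["route"])
```

Callers that still want the TorchScript route can pass `jit=True` explicitly; changing the default only affects call sites that omit the argument.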

build_model

Line 491 of clip.py deletes the state_dict keys "input_resolution", "context_length" and "vocab_size" unconditionally. When the second CLIPort stream is initialized, "input_resolution" is not a state_dict key, so the deletion raises a missing key error.

With this change, each of the three keys is deleted from the state_dict only if it is present in the first place.
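
The guarded deletion described above can be sketched as follows. This is an illustration rather than the exact diff: the helper name `drop_metadata_keys` is hypothetical, and the toy state_dict uses made-up values in place of real tensors.

```python
METADATA_KEYS = ("input_resolution", "context_length", "vocab_size")


def drop_metadata_keys(state_dict):
    """Remove the three non-weight keys in place, skipping any that are absent."""
    for key in METADATA_KEYS:
        if key in state_dict:
            del state_dict[key]
    return state_dict


# Toy state_dict with illustrative values; real entries would be tensors.
state_dict = {
    "input_resolution": 224,
    "context_length": 77,
    "vocab_size": 49408,
    "visual.conv1.weight": [0.0],
}

drop_metadata_keys(state_dict)  # first stream: removes the three keys
drop_metadata_keys(state_dict)  # second stream: keys already gone, no KeyError
```

An unconditional `del state_dict[key]` would raise `KeyError` on the second call, which matches the failure described for the second stream.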

This change follows from the discussion [1,2] in the OpenAI CLIP issue
tracker and the final fix [3] for the following error:

```
TypeError: 'torch._C.Node' object is not subscriptable
```

[1]: openai/CLIP#79
[2]: openai/CLIP#49
[3]: openai/CLIP@db20393
Addresses an error whereby "build_model" would attempt to delete the
"state_dict" key "input_resolution", which is not present in the "state_dict".
@MohitShridhar
Collaborator

Thanks @alopezrivera! ❤️ Will look into this soon.

@alopezrivera
Author

alopezrivera commented Apr 16, 2023

Hi @MohitShridhar! On testing again I found I missed one dependency in the requirements (Pillow 6.2.1, to solve a second ImportError, see here).
With the last commit I have corrected that and fully specified the version of all required libraries :)

2 participants