
feat: Integration of VLM embedding model #446

Merged on Jun 5, 2024

Commits (57)
d300035
add e5 embedding
Wendong-Fan Nov 23, 2023
524bfd4
fix typo in toml file
Wendong-Fan Nov 25, 2023
0f13021
allow user to switch embedding model from SentenceTransformer
Wendong-Fan Nov 30, 2023
9ddc871
Move the import to __init__
Wendong-Fan Nov 30, 2023
e9c3135
polish docstring
Wendong-Fan Nov 30, 2023
aeae92d
remove # type: ignore
Wendong-Fan Nov 30, 2023
b431e67
change embed_list return type and polish docstring
Wendong-Fan Nov 30, 2023
884f190
use Union[List[List[float]], ndarray] instead of List[List[float]] | …
Wendong-Fan Nov 30, 2023
9cce263
change return of embed_list from ndarray to list
Wendong-Fan Dec 3, 2023
4c7b67c
change name from SentenceTransformerEmbedding into SentenceTransforme…
Wendong-Fan Dec 3, 2023
939808e
update poetry
lightaime Dec 3, 2023
e8ce692
update poetry
lightaime Dec 3, 2023
1bf7320
update poetry
lightaime Dec 3, 2023
653b381
update poetry
lightaime Dec 3, 2023
93e795e
remove ndarray and union in embedding base file
Wendong-Fan Dec 8, 2023
a50b478
Merge branch 'master' into feature/open_source_embedding_model
Wendong-Fan Dec 8, 2023
4d5ba2d
sentence-transformer
FUYICC Jan 30, 2024
692a670
integration of clip embedding and update of license
FUYICC Feb 1, 2024
20654fd
Limit embed_list input type
FUYICC Feb 3, 2024
9e0de62
revert changes of sentence embedding
FUYICC Feb 5, 2024
b3ea26c
poetry change of pillow
FUYICC Feb 5, 2024
f1adf18
change of docstring of functions
FUYICC Feb 5, 2024
8ca7195
change of get_output_dim function
FUYICC Feb 24, 2024
c0f2b85
fix of bugs of embedding dim
FUYICC Feb 24, 2024
955bf11
allow the clip embedding accept both texts and images
FUYICC Feb 25, 2024
afb46bf
fix the bug for pytest
FUYICC Feb 28, 2024
9f98e8a
fix the bug for poetry.lock
FUYICC Feb 29, 2024
79e6d8d
refactor: refactor CLIPEmbedding class to improve readability and doc…
Appointat Mar 8, 2024
2e16ed6
chore: remove empty line in pyproject.toml
Appointat Mar 8, 2024
d5e10fb
chore: add specific test cases for image and text embeddings
Appointat Mar 8, 2024
f41f3c2
fix: fix error handling in CLIPEmbedding class
Appointat Mar 8, 2024
ddf78af
typo: fix default value capitalization in CLIPEmbedding class
Appointat Mar 8, 2024
8fe17cb
Use generics to support the type system
FUYICC Mar 11, 2024
1afd27b
store dimension into a variable
FUYICC Mar 11, 2024
e8d073d
Update update_license.py for windows compatibility
FUYICC Mar 12, 2024
f0a1573
Change to general visual language model class and use lazy initializa…
FUYICC Apr 9, 2024
0fc220d
Merge branch 'master' into CLIP_model
FUYICC Apr 12, 2024
1fa0c0f
test for inconsistency of inputs with different types
FUYICC Apr 12, 2024
71d48a2
update of poetry
FUYICC Apr 12, 2024
4de4fad
usage of **kwargs
FUYICC Apr 12, 2024
1517d52
debug for pytest
FUYICC May 2, 2024
2105510
Merge branch 'master' into CLIP_model
FUYICC May 3, 2024
ed54edf
poetry dependency
FUYICC May 3, 2024
a667614
ruff
FUYICC May 3, 2024
8aab43d
poetry
FUYICC May 3, 2024
8c1f086
return list of float
FUYICC May 5, 2024
b8bd94e
change of tests
FUYICC May 5, 2024
de718ce
Update camel/embeddings/vlm_embedding.py
FUYICC May 27, 2024
6ebf5cd
Update camel/embeddings/vlm_embedding.py
FUYICC May 27, 2024
c969597
Update camel/embeddings/vlm_embedding.py
FUYICC May 27, 2024
6b2c48e
Update camel/embeddings/vlm_embedding.py
FUYICC May 27, 2024
b0cadb0
Update camel/embeddings/vlm_embedding.py
FUYICC May 27, 2024
e2c7824
one method for **kwargs
FUYICC May 27, 2024
1c23c64
split of kwargs
FUYICC Jun 2, 2024
487dfca
add pillow into tool.poetry.extras
FUYICC Jun 2, 2024
908bb91
Merge branch 'master' into CLIP_model
FUYICC Jun 2, 2024
6b5db36
poetry lock
FUYICC Jun 2, 2024
Files changed (diff shown from 35 of 57 commits)
2 changes: 2 additions & 0 deletions camel/embeddings/__init__.py
@@ -13,8 +13,10 @@
# =========== Copyright 2023 @ CAMEL-AI.org. All Rights Reserved. ===========
from .base import BaseEmbedding
from .openai_embedding import OpenAIEmbedding
from .clip_embedding import CLIPEmbedding

__all__ = [
"BaseEmbedding",
"OpenAIEmbedding",
"CLIPEmbedding",
]
91 changes: 91 additions & 0 deletions camel/embeddings/clip_embedding.py
@@ -0,0 +1,91 @@
# =========== Copyright 2023 @ CAMEL-AI.org. All Rights Reserved. ===========
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# =========== Copyright 2023 @ CAMEL-AI.org. All Rights Reserved. ===========
from typing import Any, List, Union

from PIL import Image

from camel.embeddings import BaseEmbedding


class CLIPEmbedding(BaseEmbedding[Union[str, Image.Image]]):
r"""Provides image embedding functionalities using CLIP model.

Args:
model_name (str, optional): The model type to be used for generating
embeddings. (default: :obj:`openai/clip-vit-base-patch32`)

Raises:
RuntimeError: If an unsupported model type is specified.
"""

def __init__(self,
model_name: str = "openai/clip-vit-base-patch32") -> None:
r"""Initializes the: obj: `CLIPEmbedding` class with a specified model
and return the dimension of embeddings.

Args:
model_name (str, optional): The version name of the model to use.
(default: :obj:`openai/clip-vit-base-patch32`)
"""

from transformers import CLIPModel, CLIPProcessor
self.model = CLIPModel.from_pretrained(model_name)
self.processor = CLIPProcessor.from_pretrained(model_name)
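# Probe the model once with a dummy text to discover the embedding dimension.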
text = 'dimension'
inputs = self.processor(text=[text], return_tensors="pt")
self.dim = self.model.get_text_features(**inputs).shape[1]

def embed_list(
self,
objs: List[Union[Image.Image, str]], # to do
**kwargs: Any,
Review comment (Member):

It is noticed that the **kwargs parameter doesn't appear to be utilized within the function's implementation, and it's also not covered by the existing unit tests. Could we enhance our test suite by including tests that verify the handling of **kwargs? This would ensure that all aspects of the function's behavior are thoroughly tested. Thanks.

) -> List[List[float]]:
r"""Generates embeddings for the given images or texts.

Args:
objs (List[Image.Image|str]): The list of images or texts for
which to generate the embeddings.
**kwargs (Any): Extra kwargs passed to the embedding API.

Returns:
List[List[float]]: A list of generated embeddings, each represented
as a list of floating-point numbers.
"""
if not objs:
raise ValueError("Input text list is empty.")
result_list = []
for obj in objs:
if isinstance(obj, Image.Image):
input = self.processor(images=obj, return_tensors="pt",
padding=True)
image_feature = self.model.get_image_features(**input).tolist()
result_list.extend(image_feature)
elif isinstance(obj, str):
input = self.processor(text=obj, return_tensors="pt",
padding=True)
text_feature = self.model.get_text_features(**input).tolist()
result_list.extend(text_feature)

else:
raise ValueError("Input type is not image nor text.")
return result_list

def get_output_dim(self) -> int:
r"""Returns the output dimension of the embeddings.

Returns:
int: The dimensionality of the embedding for the current model.
"""

return self.dim
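
A minimal usage sketch of the new class (it assumes the transformers, torch, and pillow dependencies are installed and that model weights can be downloaded; the image URL mirrors the one used in the tests below):

import requests
from PIL import Image

from camel.embeddings import CLIPEmbedding

# Instantiate with the default checkpoint; weights are downloaded on first use.
embedding = CLIPEmbedding()

# Texts and images can be mixed in a single call; each entry yields one vector.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
vectors = embedding.embed_list(["two cats on a couch", image])

assert len(vectors) == 2
assert all(len(v) == embedding.get_output_dim() for v in vectors)
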
4 changes: 2 additions & 2 deletions licenses/update_license.py
@@ -39,10 +39,10 @@ def update_license_in_file(
start_line_start_with: str,
end_line_start_with: str,
) -> bool:
with open(file_path, 'r') as f:
with open(file_path, 'r', encoding='utf-8') as f:  # for Windows compatibility
content = f.read()

with open(license_template_path, 'r') as f:
with open(license_template_path, 'r', encoding='utf-8') as f:
new_license = f.read().strip()

maybe_existing_licenses = re.findall(r'^#.*?(?=\n)', content,
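
For context on this change: without an explicit encoding, open() falls back to the platform default (often cp1252 on Windows), so reading a UTF-8 source file that contains non-ASCII characters can raise UnicodeDecodeError. A minimal illustration with a hypothetical path:

# Hypothetical illustration of the failure the change guards against.
path = "camel/some_module.py"  # assume this file is UTF-8 and contains non-ASCII text

# On Windows this may decode with cp1252 and fail with UnicodeDecodeError.
with open(path, "r") as f:
    content = f.read()

# An explicit encoding makes the read behave the same on every platform.
with open(path, "r", encoding="utf-8") as f:
    content = f.read()
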
3 changes: 2 additions & 1 deletion poetry.lock


1 change: 1 addition & 0 deletions pyproject.toml
@@ -55,6 +55,7 @@ PyMuPDF = { version = "^1.22.5", optional = true }
wikipedia = { version = "^1", optional = true }
pyowm = { version = "^3.3.0", optional = true }
unstructured = { version = "^0.10.30", optional = true }
pillow = { version = "^10.2.0", optional = true }

# vector-databases
qdrant-client = { version = "^1.6.4", optional = true }
69 changes: 69 additions & 0 deletions test/embeddings/test_clip_embeddings.py
@@ -0,0 +1,69 @@
# =========== Copyright 2023 @ CAMEL-AI.org. All Rights Reserved. ===========
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# =========== Copyright 2023 @ CAMEL-AI.org. All Rights Reserved. ===========
import pytest
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

from camel.embeddings import CLIPEmbedding


def test_CLIPEmbedding_initialization():
embedding = CLIPEmbedding()
Review comment (Member):

Can we create some mock tests for the embedding instead of downloading the model every time (which is quite expensive)?

Reply (Contributor, Author):

Yes, you are right.

assert embedding is not None
assert isinstance(embedding.model, CLIPModel)
assert isinstance(embedding.processor, CLIPProcessor)


def test_image_embed_list_with_valid_input():
embedding = CLIPEmbedding()
# Test with the specific images
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
test_images = [image, image]
embeddings = embedding.embed_list(test_images)
assert isinstance(embeddings, list)
assert len(embeddings) == 2
for e in embeddings:
assert len(e) == embedding.get_output_dim()


def test_image_embed_list_with_empty_input():
embedding = CLIPEmbedding()
with pytest.raises(ValueError):
embedding.embed_list([])


def test_text_embed_list_with_valid_input():
embedding = CLIPEmbedding()
# Test with the specific texts
test_texts = ['Hello world', 'Testing sentence embeddings']
embeddings = embedding.embed_list(test_texts)
assert isinstance(embeddings, list)
assert len(embeddings) == 2
for e in embeddings:
assert len(e) == embedding.get_output_dim()


def test_text_embed_list_with_empty_input():
embedding = CLIPEmbedding()
with pytest.raises(ValueError):
embedding.embed_list([])


def test_get_output_dim():
embedding = CLIPEmbedding()
output_dim = embedding.get_output_dim()
assert isinstance(output_dim, int)
assert output_dim > 0
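
As a follow-up to the review comments above, a possible mock-based test that avoids downloading the CLIP weights; the patched targets and the 512-dimensional fake features are assumptions, not part of this PR, and transformers and torch still need to be installed (no network access is required):

from unittest.mock import MagicMock, patch

import torch

from camel.embeddings import CLIPEmbedding


@patch("transformers.CLIPProcessor.from_pretrained")
@patch("transformers.CLIPModel.from_pretrained")
def test_embed_list_with_mocked_model(mock_model_loader, mock_processor_loader):
    # Fake model that returns 512-dimensional features for both modalities.
    fake_model = MagicMock()
    fake_model.get_text_features.return_value = torch.zeros(1, 512)
    fake_model.get_image_features.return_value = torch.zeros(1, 512)
    mock_model_loader.return_value = fake_model

    # Fake processor that returns a tensor dict regardless of the input.
    fake_processor = MagicMock(
        return_value={"input_ids": torch.zeros(1, 3, dtype=torch.long)})
    mock_processor_loader.return_value = fake_processor

    embedding = CLIPEmbedding()
    vectors = embedding.embed_list(["hello", "world"])

    assert len(vectors) == 2
    assert embedding.get_output_dim() == 512
    assert all(len(v) == 512 for v in vectors)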