Contribute the WordLift Vector Store #13028
base: main
Conversation
@ziodave Seems like the tests aren't working. Have you tried running them locally?
@logan-markewich Yes, sorry, I converted the PR to a draft until we fix it. May I ask: we recreated the ubuntu-latest-unit-tester runner in our organization's GitHub Runners so that the GitHub Actions can run on our fork (https://github.com/wordlift/llama_index/actions/runs/8800952266). Oddly enough, the tests pass there. Is there a special configuration we need to apply to the GitHub Runner? This is basically the configuration we used:
@ziodave I don't think any special config is needed. The errors in the tests seem unrelated to the env, though.
@logan-markewich We're on it; we'll update the PR soon. Thanks!
@logan-markewich We're ready for review 🙏
llama-index-integrations/vector_stores/llama-index-vector-stores-wordlift/README.md
def __init__(self, key: str):
    self.key = key

async def for_add(self, nodes: List[BaseNode]) -> str:
Curious why these methods have to be async? Or what this is even for, actually (they all return the same thing 👀).
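The reviewer's point can be illustrated with a minimal sketch (the class and method names here are hypothetical stand-ins, not the actual PR code): a coroutine that never awaits anything gains nothing over a plain function, and forces callers to run an event loop.

```python
import asyncio


class NodeRequestBuilder:
    """Hypothetical sketch: an async method with no awaited I/O
    behaves identically to its synchronous equivalent."""

    async def for_add(self, nodes: list) -> str:
        # No `await` inside: this coroutine just wraps a constant.
        return "add"

    def for_add_sync(self, nodes: list) -> str:
        # Equivalent plain function, callable without an event loop.
        return "add"


builder = NodeRequestBuilder()
result_async = asyncio.run(builder.for_add([]))
result_sync = builder.for_add_sync([])
assert result_async == result_sync
```

If the methods later need to await real network calls, keeping them async makes sense; otherwise the sync form is simpler for callers.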
So if I'm understanding properly, I can't use just any embedding model with WordLift; it has to be a specific one.
I'm unfamiliar with WordLift, but what's the reason for that restriction?
node_id=node.node_id,
embeddings=node.get_embedding(),
text=node.get_content(metadata_mode=MetadataMode.NONE) or "",
metadata={},
any reason to not store the node metadata here?
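One way to address this review comment is to forward the node's own metadata instead of an empty dict. A minimal sketch, with a simplified `Node` stand-in and a hypothetical `build_payload` helper (neither is the actual PR code):

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    # Simplified stand-in for llama_index's BaseNode.
    node_id: str
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_payload(node: Node) -> Dict[str, Any]:
    # Hypothetical helper: pass the node's metadata through
    # rather than hardcoding metadata={}.
    return {
        "node_id": node.node_id,
        "text": node.text or "",
        "metadata": dict(node.metadata),  # copy to avoid shared mutation
    }


node = Node("n1", "hello", {"source": "docs/intro.md"})
payload = build_payload(node)
assert payload["metadata"]["source"] == "docs/intro.md"
```

Preserving metadata at ingest time is what later lets metadata filters work at query time.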
for node in nodes:
    node_dict = node.dict()
    metadata: Dict[str, Any] = node_dict.get("metadata", {})
    entity_id = metadata.get("entity_id", None)
Is an entity id required? What happens if it's not provided?
(I really should go read more about WordLift, haha.)
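One common way to handle the missing-id case raised above is a deterministic fallback. This is a sketch of that pattern only, not WordLift's actual behavior; `resolve_entity_id` is a hypothetical helper:

```python
import uuid
from typing import Any, Dict, Optional


def resolve_entity_id(metadata: Dict[str, Any], node_id: str) -> str:
    # Hypothetical fallback: if the caller supplied no entity_id,
    # derive a stable one from the node id so repeated upserts of
    # the same node stay idempotent.
    entity_id: Optional[str] = metadata.get("entity_id")
    if entity_id:
        return entity_id
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"node:{node_id}"))


assert resolve_entity_id({"entity_id": "e1"}, "n1") == "e1"
# Deterministic: the same node_id always maps to the same fallback id.
assert resolve_entity_id({}, "n1") == resolve_entity_id({}, "n1")
```

Using `uuid5` (name-based) rather than `uuid4` (random) is what makes the fallback reproducible across runs.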
from manager_client.rest import RESTResponseType


class VectorSearchQueriesApi:
Any plans to publish this as its own package, like a client SDK? It's fine if it lives here for now, though.
This looks mostly good to me; I'm just worried about a few UX things that might trip up users. I wonder if we can smooth some of this out.
Hello @logan-markewich, yes, we'll review your comments and be back soon with updates and answers.
Description
Add support for the WordLift Vector Store.
Fixes #12107
New Package?
Did I fill in the `tool.llamahub` section in the `pyproject.toml` and provide a detailed README.md for my new integration or package?

Version Bump?
Did I bump the version in the `pyproject.toml` file of the package I am updating? (Except for the `llama-index-core` package)

Type of Change
How Has This Been Tested?
Suggested Checklist:
- I ran `make format; make lint` to appease the lint gods