feat: ES VectorStore #1500

Merged
merged 10 commits into from May 14, 2024

IamWWT
Contributor

@IamWWT IamWWT commented May 8, 2024

Description

Close #1062
Close #79
Close #1483

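Verified features:
1) A knowledge space can be created (names made up of English letters and digits, or made up entirely of Chinese characters, are supported; names that mix Chinese with English letters or with digits are not).
2) Documents can be uploaded for embedding.
3) Uploaded documents can be deleted one at a time.
4) Chat search works. (Keywords are extracted with jieba's TextRank implementation, jieba.analyse.textrank; this can later be tuned for semantics to make lookups more precise.)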
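
The naming rule in item 1 can be sketched as a small check. This is only an illustration of the constraint as described, not the code from this PR; the function name `is_supported_space_name` is hypothetical, and it assumes "English" means ASCII letters and "Chinese" means the CJK Unified Ideographs block.

```python
import re

# Assumption: "English and digits" means ASCII letters and digits only.
_ASCII_NAME = re.compile(r"[A-Za-z0-9]+")
# Assumption: "all Chinese" means characters in the CJK Unified Ideographs block.
_CHINESE_NAME = re.compile(r"[\u4e00-\u9fff]+")


def is_supported_space_name(name: str) -> bool:
    """Return True if the name is all ASCII alphanumerics or all Chinese characters.

    Hypothetical helper: it mirrors the rule stated in the description,
    so mixed Chinese/English or Chinese/digit names are rejected.
    """
    return bool(_ASCII_NAME.fullmatch(name) or _CHINESE_NAME.fullmatch(name))


if __name__ == "__main__":
    for candidate in ["docs2024", "知识库", "知识base", "知识2024"]:
        print(candidate, is_supported_space_name(candidate))
```

A check like this could run before the space is created, so that an unsupported name is reported to the user early.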

How Has This Been Tested?

After setting the following variables in the .env file, the rest of the workflow is the same as the "chat knowledge" workflow.
VECTOR_STORE_TYPE=ElasticSearch
ElasticSearch_URL=127.0.0.1
ElasticSearch_PORT=9200
ElasticSearch_USERNAME=elastic
ElasticSearch_PASSWORD=i=+iLw9y0Jduq86XTi6W

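To make it easy to inspect the knowledge content matched by the ES search, elastic_store.py writes a file named "es_search_results.txt" to the project root. If you do not need it, comment out that code (search for the keyword result_file).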
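
As an alternative to commenting the code out, the debug dump could be gated behind a switch. The sketch below is not taken from elastic_store.py; the function `dump_search_results` and the environment variable `ES_DEBUG_DUMP` are hypothetical names used only for illustration.

```python
import os
from pathlib import Path


def dump_search_results(results, path="es_search_results.txt", enabled=None):
    """Write one search result per line to `path` when the dump is enabled.

    Hypothetical helper: if `enabled` is None, the decision is read from the
    (assumed) ES_DEBUG_DUMP environment variable. Returns True if a file
    was written, False otherwise.
    """
    if enabled is None:
        enabled = os.getenv("ES_DEBUG_DUMP", "").lower() in ("1", "true")
    if not enabled:
        return False
    Path(path).write_text("\n".join(str(r) for r in results), encoding="utf-8")
    return True
```

With a switch like this the file is only written when someone asks for it, so nothing has to be edited before deployment.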

Checklist:

  • [ ok ] My code follows the style guidelines of this project
  • [ ok ] I have already rebased the commits and made the commit messages conform to the project standard.
  • [ ok ] I have performed a self-review of my own code
  • [ ok ] I have commented my code, particularly in hard-to-understand areas
  • [ ok ] I have made corresponding changes to the documentation
  • [ ok ] Any dependent changes have been merged and published in downstream modules

@Aries-ckt
Collaborator

Amazing feature, thanks for your contribution. We will test it and merge it.

@csunny
Collaborator

csunny commented May 9, 2024

@IamWWT Hi, the code style check failed. Please run black . and make fmt to fix this. If you have any problems with this, feel free to contact us at any time.

@IamWWT
Contributor Author

IamWWT commented May 9, 2024

@IamWWT Hi, the code style check failed. Please run black . and make fmt to fix this. If you have any problems with this, feel free to contact us at any time.

New commit: 8b8a9b9

How the formatting fix was applied in the new commit:
1) black .
2) yapf -i dbgpt/storage/vector_store/elastic_store.py

@Aries-ckt
Collaborator

Hi @IamWWT, I saw that you formatted other files that are not related to es_vector_store. Can you limit the changes to the files related to ES?

@IamWWT
Contributor Author

IamWWT commented May 11, 2024

Hi @IamWWT, I saw that you formatted other files that are not related to es_vector_store. Can you limit the changes to the files related to ES?

Third commit with the fix: 3d7f77c

On top of the second commit, only the .py files touched by this ES addition were reformatted:
1) black:
black elastic_store.py
black __init__.py
black service.py
black config.py

2) yapf:
yapf -i dbgpt/storage/vector_store/elastic_store.py
yapf -i dbgpt/storage/vector_store/__init__.py
yapf -i dbgpt/app/knowledge/service.py
yapf -i dbgpt/_private/config.py

@Aries-ckt
Collaborator

Aries-ckt commented May 14, 2024

Test Success!

  1. Download and run an Elasticsearch instance. Reference: https://www.elastic.co/guide/en/elasticsearch/reference/8.13/targz.html
  2. Install the Python dependencies:
pip install langchain
pip install elasticsearch
  3. Set the Elasticsearch settings in .env:
VECTOR_STORE_TYPE=ElasticSearch
ELASTICSEARCH_URL=127.0.0.1
ELASTICSEARCH_PORT=9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD="{YOUR_PASSWORD}"
  4. Start dbgpt_server.
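
For illustration, the .env settings above could be read into a connection description roughly as follows. This is a minimal sketch, not the project's config code: the function name `es_settings_from_env` is hypothetical, the defaults are taken from the example values in this comment, and the plain `http` scheme is an assumption (a secured Elasticsearch 8.x instance may require `https`).

```python
import os
from typing import Mapping, Optional


def es_settings_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the ELASTICSEARCH_* variables into one settings dict.

    Hypothetical helper. Pass a mapping to override os.environ, which is
    convenient for testing.
    """
    env = os.environ if env is None else env
    host = env.get("ELASTICSEARCH_URL", "127.0.0.1")
    port = int(env.get("ELASTICSEARCH_PORT", "9200"))
    return {
        # Assumption: plain HTTP; switch to https if security is enabled.
        "endpoint": f"http://{host}:{port}",
        "username": env.get("ELASTICSEARCH_USERNAME", "elastic"),
        "password": env.get("ELASTICSEARCH_PASSWORD"),
    }


if __name__ == "__main__":
    settings = es_settings_from_env()
    # Avoid printing the password itself.
    print(settings["endpoint"], settings["username"])
```

Printing the resolved endpoint before starting dbgpt_server is a quick way to confirm that the .env values were picked up.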

Collaborator

@Aries-ckt Aries-ckt left a comment

LGTM.

@fangyinc fangyinc changed the title from "[New Feature] ES VectorStore" to "feat: ES VectorStore" May 14, 2024
@github-actions github-actions bot added the enhancement New feature or request label May 14, 2024
Collaborator

@fangyinc fangyinc left a comment

LGTM.

Collaborator

@csunny csunny left a comment

LGTM~

@csunny csunny merged commit db4d318 into eosphoros-ai:main May 14, 2024
3 checks passed
@csunny csunny mentioned this pull request May 15, 2024
6 tasks