Step-by-Step Guide for Benchmark

For any cloud vector database, the testing process follows the flowchart below:

![Benchmark flowchart](4steps.svg)

Below are the specific testing processes for each cloud vector database.

MyScale

Step 1. Create Cluster

Go to the MyScale official website and create a cluster. In the cluster console, record the cluster connection information: host, port, username, and password.

![MyScale console](MyScaleConsole.jpg)

Step 2. Modify the configuration

We have provided two configuration files for testing MyScale:

Write the cluster connection information recorded in Step 1 into the configuration files: open each file, locate the connection_params section, and update the host, port, user, and password values. Then move the modified configuration files into the experiments/configurations directory. Here is an example of how the modified section may look:

```json
"connection_params": {
  "host": "your_host.aws.dev.myscale.cloud",
  "port": 8443,
  "http_type": "http",
  "user": "your_username",
  "password": "your_password"
},
```
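If you have several configuration files to update, a small script can patch the connection_params section for you. This is only a convenience sketch (not part of the benchmark tool); it assumes the configuration file is plain JSON with a top-level connection_params object, as in the example above:

```python
import json

def set_connection_params(path, params):
    """Merge new connection settings into a JSON benchmark config file."""
    with open(path) as f:
        config = json.load(f)
    # Create connection_params if absent, then overwrite the given keys.
    config.setdefault("connection_params", {}).update(params)
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

# Usage (the file name is illustrative):
# set_connection_params(
#     "experiments/configurations/myscale-cloud.json",
#     {"host": "your_host.aws.dev.myscale.cloud", "port": 8443,
#      "http_type": "http", "user": "your_username",
#      "password": "your_password"},
# )
```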

Step 3. Run the tests

```shell
python3 run.py --engines "*myscale*"
```

(The pattern is quoted so the shell does not expand the wildcards before run.py sees them.)

Step 4. View the test results

```shell
cd results
grep -E 'rps|mean_precision' $(ls -t)
```
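Here `$(ls -t)` expands to the result files sorted newest first, so the latest run appears at the top of the grep output. If you prefer a structured view, a short script can pull the same two fields out of the newest results file. This is a sketch under the assumption that results are written as JSON files containing rps and mean_precision keys (the `results/*.json` pattern is a guess):

```python
import glob
import json
import os

def find_metrics(obj, keys):
    """Recursively collect the given keys from nested dicts/lists."""
    found = {}
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k in keys:
                found[k] = v
            found.update(find_metrics(v, keys))
    elif isinstance(obj, list):
        for item in obj:
            found.update(find_metrics(item, keys))
    return found

# Newest result file first, mirroring `ls -t`.
files = sorted(glob.glob("results/*.json"), key=os.path.getmtime, reverse=True)
for path in files[:1]:
    with open(path) as f:
        print(path, find_metrics(json.load(f), {"rps", "mean_precision"}))
```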

![MyScale results](MyScaleResults.jpg)

Pinecone

Step 1. Create Cluster

Register with Pinecone and obtain the cluster connection information: the Environment and the API key Value.

![Pinecone console](PineconeConsole.jpg)

Step 2. Modify the configuration

We have provided two configuration files for testing Pinecone:

Write the cluster connection information recorded in Step 1 into the configuration files: locate the connection_params section and update the api-key and environment values. Then move the modified configuration files into the experiments/configurations directory. Here is an example of how the modified section may look:

```json
"connection_params": {
  "api-key": "your_api_key",
  "environment": "your_environment"
},
```

Step 3. Run the tests

```shell
python3 run.py --engines "*pinecone*"
```

Step 4. View the test results

```shell
cd results
grep -E 'rps|mean_precision' $(ls -t)
```

![Pinecone results](PineconeResults.jpg)

Zilliz

Step 1. Create Cluster

Find the cluster connection information (end_point, user, and password) in the Zilliz Cloud console. The user and password are the credentials you specified when creating the cluster.

![Zilliz Cloud console](ZillizConsole.jpg)

Step 2. Modify the configuration

We have provided two configuration files for testing Zilliz:

Write the cluster connection information recorded in Step 1 into the configuration files: open each file, locate the connection_params section, and update the end_point, cloud_user, and cloud_password values. Then move the modified configuration files into the experiments/configurations directory.

Here is an example of how the modified section may look:

```json
"connection_params": {
  "cloud_mode": true,
  "host": "127.0.0.1",
  "port": 19530,
  "user": "",
  "password": "",
  "end_point": "https://your_host.zillizcloud.com:19538",
  "cloud_user": "your_user",
  "cloud_password": "your_password",
  "cloud_secure": true
},
```

Step 3. Run the tests

```shell
python3 run.py --engines "*zilliz*"
```

Step 4. View the test results

```shell
cd results
grep -E 'rps|mean_precision' $(ls -t)
```

![Zilliz results](ZillizResults.jpg)

Weaviate Cloud

Step 1. Create Cluster

Register with Weaviate Cloud and create a cluster. Record the cluster connection information: the cluster URL and the Authentication (API key).

![Weaviate Cloud console](WeaviateConsole.jpg)

Step 2. Modify the configuration

We have provided two configuration files for testing Weaviate Cloud:

Write the cluster connection information recorded in Step 1 into the configuration files: locate the connection_params section and update the host and api_key values. The host is the cluster URL, and the api_key is the Authentication key. Then move the modified configuration files into the experiments/configurations directory.

Here is an example of how the modified section may look:

```json
"connection_params": {
  "host": "https://your_host.weaviate.cloud",
  "port": 8090,
  "timeout_config": 2000,
  "api_key": "your_api_key"
},
```

Step 3. Run the tests

```shell
python3 run.py --engines "*weaviate*"
```

Step 4. View the test results

```shell
cd results
grep -E 'rps|mean_precision' $(ls -t)
```

![Weaviate results](WeaviateResults.jpg)

Qdrant

Step 1. Create Cluster

Register with Qdrant Cloud and create a cluster. Record the cluster connection information: the URL and the API key.

![Qdrant console](QdrantConsole.jpg)

Step 2. Modify the configuration

We have provided three configuration files for testing Qdrant:

Write the cluster connection information recorded in Step 1 into the configuration files: locate the connection_params section and update the host and api_key values. Note that you must remove the port from the end of the host string, since the port is specified separately. Then move the modified configuration files into the experiments/configurations directory. Here is an example of how the modified section may look:

```json
"connection_params": {
  "host": "https://your_host.aws.cloud.qdrant.io",
  "port": 6333,
  "grpc_port": 6334,
  "prefer_grpc": false,
  "api_key": "your_api_key"
},
```
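The step of removing the port from the host string can be mechanized. This small sketch (not part of the benchmark tool) splits a console URL such as `https://your_host.aws.cloud.qdrant.io:6333` into the separate host and port fields the configuration expects:

```python
from urllib.parse import urlparse

def split_host_port(url):
    """Split 'https://host:6333' into ('https://host', 6333)."""
    parsed = urlparse(url)
    # hostname drops the port; port is parsed as an int (or None if absent).
    return f"{parsed.scheme}://{parsed.hostname}", parsed.port

host, port = split_host_port("https://your_host.aws.cloud.qdrant.io:6333")
# `host` goes into the "host" field; `port` into the "port" field.
```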

Step 3. Run the tests

```shell
python3 run.py --engines "*qdrant*"
```

Step 4. View the test results

```shell
cd results
grep -E 'rps|mean_precision' $(ls -t)
```

![Qdrant results](QdrantResults.jpg)

OpenSearch

Step 1. Create Cluster

Create an OpenSearch domain in your AWS console. When filling in the fine-grained access control settings, select "Set IAM ARN as Master User" for the Master User and enter your ARN. Record the cluster's domain endpoint (without the https:// prefix).

![OpenSearch console](OpenSearch.jpg)

Step 2. Modify the configuration

We have provided two configuration files for testing OpenSearch:

Write the cluster connection information recorded in Step 1 into the configuration files: locate the connection_params section and update the host, aws_access_key_id, and aws_secret_access_key values. Then move the modified configuration files into the experiments/configurations directory. Here is an example of how the modified section may look:

```json
"connection_params": {
  "host": "your opensearch cluster domain endpoint",
  "port": 443,
  "user": "elastic",
  "password": "passwd",
  "aws_access_key_id": "your aws access key id",
  "aws_secret_access_key": "your aws secret access key",
  "region": "us-east-2",
  "service": "es"
},
```
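Rather than committing AWS secrets into a configuration file, you can fill them in from the standard AWS environment variables just before running. This is a sketch (not part of the benchmark tool); it only assumes the conventional `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` variables:

```python
import os

def fill_aws_credentials(params):
    """Populate AWS fields from the standard environment variables, if set."""
    env_map = {
        "aws_access_key_id": "AWS_ACCESS_KEY_ID",
        "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    }
    for field, env in env_map.items():
        value = os.environ.get(env)
        if value:  # leave the existing placeholder if the variable is unset
            params[field] = value
    return params
```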

Step 3. Run the tests

```shell
python3 run.py --engines "*opensearch*"
```

Step 4. View the test results

```shell
cd results
grep -E 'rps|mean_precision' $(ls -t)
```

![OpenSearch results](OpenSearchResults.jpg)

Postgres (Pgvector & Pgvecto.rs)

Step 1. Create Server

To deploy Postgres with pgvector (a Postgres extension written in C), you can use Docker. Below is an example docker-compose.yaml configuration:

```yaml
version: '3'

services:
  pgvector:
    image: ankane/pgvector:latest
    container_name: pgvector
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=123456
    ports:
      - "5432:5432"
```

Similarly, Postgres with pgvecto.rs (a Postgres extension written in Rust) can be deployed with Docker. Here is a corresponding docker-compose.yaml example:

```yaml
version: '3'

services:
  pgvecto-rs:
    image: tensorchord/pgvecto-rs:latest
    container_name: pgvecto-rs
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=123456
    ports:
      - "5432:5432"
```

Note that both services publish port 5432, so run only one of them at a time (or remap the host port).

Step 2. Modify the configuration

We have provided four configuration files for testing.

For pgvector:

For pgvecto.rs:

After deploying your own Postgres service, modify the connection_params fields in the configuration file. You can also append custom search_params entries for testing, and adjust upload_params to change the index-creation parameters.

```json
"connection_params": {
    "host": "127.0.0.1",
    "port": 5432,
    "user": "root",
    "password": "123456"
}
```
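As a quick sanity check of these settings outside the benchmark, the same fields can be rendered as a libpq-style keyword/value connection string, which any Postgres client (e.g. psql) accepts. A minimal sketch, assuming the connection_params keys map one-to-one onto libpq keywords:

```python
def to_dsn(params):
    """Render connection_params as a libpq keyword/value DSN."""
    return " ".join(f"{k}={v}" for k, v in params.items())

dsn = to_dsn({"host": "127.0.0.1", "port": 5432,
              "user": "root", "password": "123456"})
# dsn == "host=127.0.0.1 port=5432 user=root password=123456"
```

You can then try `psql "host=127.0.0.1 port=5432 user=root password=123456"` to confirm the container is reachable before starting a long benchmark run.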

Step 3. Run the tests

```shell
python3 run.py --engines "*pgvector*"
```

Step 4. View the test results

```shell
cd results
grep -E 'rps|mean_precision' $(ls -t)
```

![pgvector results](PGVectorResults.png)