Searching with IVF index

The previous document, Getting started, introduced basic usage of Milvus through the Python and Node SDKs. This document shows how to use the IVF family of indexes to speed up vector retrieval.

IVF_FLAT

IVF_FLAT divides vector data into nlist cluster units, and then compares distances between the target input vector and the center of each cluster. Depending on the number of clusters the system is set to query (nprobe), similarity search results are returned based on comparisons between the target input and the vectors in the most similar cluster(s) only, which drastically reduces query time.
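
The two-step lookup described above can be sketched in a few lines of plain Python. This is a toy illustration only, not how Milvus implements IVF_FLAT: the helper names (`build_ivf`, `ivf_search`), the handful of k-means iterations, and the random data are all assumptions made for the example.

```python
import random

def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(vectors, centroids):
    """Put the index of every vector into the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def build_ivf(vectors, nlist, iters=5, seed=0):
    """Cluster the vectors into nlist units with a few k-means iterations."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    for _ in range(iters):
        lists = assign(vectors, centroids)
        for c, members in enumerate(lists):
            if members:  # keep the old center if a cluster ends up empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*(vectors[i] for i in members))]
    return centroids, assign(vectors, centroids)

def ivf_search(query, vectors, centroids, lists, nprobe, top_k):
    # Step 1: rank the cluster centers by distance to the query.
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    # Step 2: compare the query only with vectors in the nprobe closest clusters.
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: l2(query, vectors[i]))
    return candidates[:top_k]

rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(500)]
centroids, lists = build_ivf(data, nlist=16)
query = data[0]
approx = ivf_search(query, data, centroids, lists, nprobe=2, top_k=5)
exact = ivf_search(query, data, centroids, lists, nprobe=16, top_k=5)
print("nprobe=2 :", approx)
print("nprobe=16:", exact)
```

With nprobe equal to nlist every cluster is scanned, so the search is exhaustive. Smaller values scan fewer vectors and may miss neighbors that fall in unprobed clusters.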

By adjusting nprobe, you can balance accuracy against speed for a given scenario. Results from the IVF_FLAT performance test show that query time increases sharply as the number of target input vectors (nq) and the number of clusters to search (nprobe) increase.
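
To get a feel for why nq and nprobe drive query time, the number of distance computations can be estimated with a back-of-the-envelope model. The sketch below assumes that clusters are evenly sized and that distance computations dominate the cost. Both are simplifications, and the function name is hypothetical.

```python
def ivf_distance_computations(nq, n, nlist, nprobe):
    """Rough count of distance computations for an IVF search.

    nq: number of query vectors, n: number of indexed vectors.
    Assumes every cluster holds about n / nlist vectors.
    """
    coarse = nlist               # each query is compared with every cluster center
    fine = nprobe * (n / nlist)  # then with the vectors in the probed clusters
    return nq * (coarse + fine)

n, nlist = 1_000_000, 1024
for nprobe in (1, 8, 64, 1024):
    cost = ivf_distance_computations(nq=1, n=n, nlist=nlist, nprobe=nprobe)
    print(f"nprobe={nprobe:>4}: ~{cost:,.0f} distance computations per query")
```

Under this model the work grows with both nq and nprobe, and setting nprobe to nlist costs slightly more than a brute-force scan because of the extra comparisons against the cluster centers.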

IVF_FLAT is the most basic IVF index, and the encoded data stored in each unit is consistent with the original data.

Index building parameters

Parameter	Description	Range
`nlist`	Number of cluster units	[1, 65536]

Search parameters

Parameter	Description	Range
`nprobe`	Number of units to query	CPU: [1, nlist]; GPU: [1, min(2048, nlist)]

For example:

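In Python

from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection
import numpy as np

# connect to Milvus (host and port are assumptions; adjust to your deployment)
connections.connect(alias="default", host="localhost", port="19530")

# example data (random, for illustration only)
d = 128                                              # vector dimension
xb = np.random.random((10000, d)).astype("float32")  # vectors to insert
xq = np.random.random((5, d)).astype("float32")      # query vectors

# create a collection
collection_name = "milvus_test1"
default_fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=d)
]
default_schema = CollectionSchema(fields=default_fields, description="test collection")
print("\nCreate collection...")
collection = Collection(name=collection_name, schema=default_schema)

# insert data (the id column is omitted because auto_id=True)
mr = collection.insert([xb.tolist()])
print(collection.num_entities)

# create index
collection.create_index(field_name="vector",
                        index_params={'index_type': 'IVF_FLAT',
                                      'metric_type': 'L2',
                                      'params': {
                                        'nlist': 100      # int. 1~65536
                                      }})

# load the collection into memory
collection.load()

# search
top_k = 10
results = collection.search(data=xq.tolist(), anns_field="vector",
                            param={"metric_type": "L2",
                                   "params": {"nprobe": 8}},  # int. 1~nlist(cpu), 1~min[2048, nlist](gpu)
                            limit=top_k)
# show results: one entry per query vector
for result in results:
    print(result.ids)
    print(result.distances)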

In Node

// DataType is used in the schema below; its import path may differ between SDK versions
import { MilvusClient, DataType } from "@zilliz/milvus2-sdk-node";

// connect to Milvus
const milvusClient = new MilvusClient("localhost:19530");

// example data (random, for illustration only)
const d = 8; // vector dimension; matches the 8-element query vector used in the search
const entities = Array.from({ length: 1000 }, () => ({
  vector: Array.from({ length: d }, () => Math.random()),
}));

// create a collection
const collection_name = "milvus_test1";
const params = {
  collection_name: collection_name,
  fields: [
    {
      name: "vector",
      description: "vector field",
      data_type: DataType.FloatVector,
      type_params: {
        dim: d,
      },
    },
    {
      name: "id",
      data_type: DataType.Int64,
      autoID: true,
      is_primary_key: true,
      description: "",
    },
  ],
};
await milvusClient.collectionManager.createCollection(params);

// insert data
await milvusClient.dataManager.insert({
  collection_name: collection_name,
  fields_data: entities,
});

// flush data to disk
await milvusClient.dataManager.flush({ collection_names: [collection_name] });

// create index
await milvusClient.collectionManager.createIndex({
  collection_name: collection_name,
  field_name: "vector",
  extra_params: {
    index_type: "IVF_FLAT",
    metric_type: "L2",
    params: JSON.stringify({ nlist: 100 }),
  },
});

// load data to memory
await milvusClient.collectionManager.loadCollection({
  collection_name: collection_name,
});

// search
const top_k = 5;
const searchParams = {
  anns_field: "vector",
  topk: top_k,
  metric_type: "L2",
  params: JSON.stringify({ nprobe: 10 }),
};

await milvusClient.dataManager.search({
  collection_name: collection_name,
  expr: "",
  vectors: [[1, 2, 3, 4, 5, 6, 7, 8]],
  search_params: searchParams,
  vector_type: 100, // Float vector -> 100
});

IVF_SQ8

IVF_FLAT does not perform any compression, so the index files it produces are roughly the same size as the original, raw non-indexed vector data. For example, if the original 1B SIFT dataset is 476 GB, its IVF_FLAT index files will be slightly smaller (~470 GB). Loading all the index files into memory will consume 470 GB of memory.
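
The raw-data figure can be reproduced with simple arithmetic. The sketch below assumes 128-dimensional float32 vectors (the dimensionality of SIFT descriptors) and reads "GB" as GiB (2^30 bytes). Neither assumption is stated in the text above.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Size of uncompressed vectors: one 4-byte float per dimension by default."""
    return num_vectors * dim * bytes_per_value

size = raw_vector_bytes(1_000_000_000, 128)
print(f"{size / 2**30:.1f} GiB")  # 476.8 GiB
```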

When disk, CPU, or GPU memory resources are limited, IVF_SQ8 is a better option than IVF_FLAT. This index type can convert each FLOAT (4 bytes) to UINT8 (1 byte) by performing scalar quantization. This reduces disk, CPU, and GPU memory consumption by 70–75%. For the 1B SIFT dataset, the IVF_SQ8 index files require just 140 GB of storage.
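
The FLOAT to UINT8 conversion can be illustrated with a minimal per-dimension min/max quantizer. This is a sketch of scalar quantization in general, not the exact scheme Milvus uses. The function names and the choice of min/max scaling are assumptions for the example.

```python
import random
import struct

def train_sq8(vectors):
    """Record the per-dimension minimum and maximum over the training vectors."""
    columns = list(zip(*vectors))
    return [min(c) for c in columns], [max(c) for c in columns]

def encode_sq8(vector, lo, hi):
    """Map each float onto an integer code in [0, 255], i.e. one byte per dimension."""
    codes = []
    for x, a, b in zip(vector, lo, hi):
        scale = (b - a) or 1.0  # avoid dividing by zero on constant dimensions
        q = round((x - a) / scale * 255)
        codes.append(min(255, max(0, q)))  # clamp values outside the trained range
    return bytes(codes)

def decode_sq8(code, lo, hi):
    """Reconstruct approximate floats from the one-byte codes."""
    return [a + (c / 255) * (b - a) for c, a, b in zip(code, lo, hi)]

rng = random.Random(0)
data = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(100)]
lo, hi = train_sq8(data)
code = encode_sq8(data[0], lo, hi)
restored = decode_sq8(code, lo, hi)

float_bytes = len(struct.pack(f"{len(data[0])}f", *data[0]))  # 4 bytes per value
print(float_bytes, len(code))  # 64 16
max_error = max(abs(x - y) for x, y in zip(data[0], restored))
print(f"largest reconstruction error: {max_error:.4f}")
```

The vector payload shrinks to a quarter of its size, which corresponds to the upper end of the 70–75% range quoted above. The price is a small, bounded reconstruction error in each dimension.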

Index building parameters

Parameter	Description	Range
`nlist`	Number of cluster units	[1, 65536]

Search parameters

Parameter	Description	Range
`nprobe`	Number of units to query	CPU: [1, nlist]; GPU: [1, min(2048, nlist)]

For example:

In Python

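from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection
import numpy as np

# connect to Milvus (host and port are assumptions; adjust to your deployment)
connections.connect(alias="default", host="localhost", port="19530")

# example data (random, for illustration only)
d = 128                                              # vector dimension
xb = np.random.random((10000, d)).astype("float32")  # vectors to insert
xq = np.random.random((5, d)).astype("float32")      # query vectors

# create a collection
collection_name = "milvus_test2"
default_fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=d)
]
default_schema = CollectionSchema(fields=default_fields, description="test collection")
print("\nCreate collection...")
collection = Collection(name=collection_name, schema=default_schema)

# insert data (the id column is omitted because auto_id=True)
mr = collection.insert([xb.tolist()])
print(collection.num_entities)

# create index
collection.create_index(field_name="vector",
                        index_params={'index_type': 'IVF_SQ8',
                                      'metric_type': 'L2',
                                      'params': {
                                        'nlist': 100      # int. 1~65536
                                      }})

# load the collection into memory
collection.load()

# search
top_k = 10
results = collection.search(data=xq.tolist(), anns_field="vector",
                            param={"metric_type": "L2",
                                   "params": {"nprobe": 8}},  # int. 1~nlist(cpu), 1~min[2048, nlist](gpu)
                            limit=top_k)

# show results: one entry per query vector
for result in results:
    print(result.ids)
    print(result.distances)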

IVF_PQ

PQ (Product Quantization) uniformly decomposes the original high-dimensional vector space into Cartesian products of m low-dimensional vector spaces, and then quantizes the decomposed low-dimensional vector spaces. Instead of calculating the distances between the target vector and the center of all the units, product quantization enables the calculation of distances between the target vector and the clustering center of each low-dimensional space and greatly reduces the time complexity and space complexity of the algorithm.
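
The decomposition described above can be sketched in plain Python. This is a toy product quantizer for illustration, not the Milvus implementation. The helper names, the short k-means training, and the small sizes (8 dimensions, m = 4, 4 bits per sub-vector) are assumptions chosen to keep the example fast.

```python
import random

def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=5, seed=0):
    """A few plain k-means iterations; returns k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda c: l2(p, centers[c]))].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the old center if a group ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def split(vector, m):
    """Cut a vector into m equally sized sub-vectors (len(vector) % m must be 0)."""
    step = len(vector) // m
    return [vector[i * step:(i + 1) * step] for i in range(m)]

def train_pq(vectors, m, nbits):
    """Train one codebook of 2**nbits centers for each of the m sub-spaces."""
    return [kmeans([split(v, m)[j] for v in vectors], 2 ** nbits, seed=j)
            for j in range(m)]

def encode_pq(vector, codebooks):
    """Replace each sub-vector with the index of its nearest codebook center."""
    return [min(range(len(cb)), key=lambda c: l2(sub, cb[c]))
            for sub, cb in zip(split(vector, len(codebooks)), codebooks)]

def adc_distance(query, code, codebooks):
    """Approximate squared L2 distance between a raw query and an encoded vector."""
    subs = split(query, len(codebooks))
    return sum(l2(sub, cb[c]) for sub, cb, c in zip(subs, codebooks, code))

rng = random.Random(1)
data = [[rng.random() for _ in range(8)] for _ in range(200)]
codebooks = train_pq(data, m=4, nbits=4)
codes = [encode_pq(v, codebooks) for v in data]

query = data[0]
print("code for data[0]:", codes[0])
print("approximate distance to itself:", adc_distance(query, codes[0], codebooks))
print("approximate vs. exact distance to data[1]:",
      adc_distance(query, codes[1], codebooks), l2(query, data[1]))
```

Each stored vector is reduced to m small integers, and distances are computed against the codebook centers rather than the original vectors. That is where both the space savings and the loss of accuracy come from.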

IVF_PQ performs IVF index clustering before applying product quantization to the vectors. Its index file is even smaller than that of IVF_SQ8, but it also causes a loss of accuracy when searching vectors.

Index building parameters and search parameters vary with Milvus distribution.

Index building parameters

Parameter	Description	Range
`nlist`	Number of cluster units	[1, 65536]
`m`	Number of factors of product quantization	dim ≡ 0 (mod m)
`nbits`	[Optional] Number of bits in which each low-dimensional vector is stored.	[1, 16] (8 by default)
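
The constraints in the table can be checked before building an index, and the same numbers give the size of each encoded vector. The helper below is hypothetical and not part of pymilvus. It counts only the PQ codes and ignores IDs, codebooks, and other index overhead.

```python
def pq_code_bytes(dim, m, nbits=8):
    """Validate IVF_PQ build parameters and return the PQ code size per vector in bytes."""
    if dim % m != 0:
        raise ValueError(f"dim ({dim}) must be divisible by m ({m})")
    if not 1 <= nbits <= 16:
        raise ValueError(f"nbits ({nbits}) must be in [1, 16]")
    return m * nbits / 8  # m sub-vector indices of nbits bits each

dim, m = 128, 16
raw = dim * 4  # float32 storage for the uncompressed vector
code = pq_code_bytes(dim, m)
print(f"raw: {raw} bytes, PQ code: {code:.0f} bytes ({raw / code:.0f}x smaller)")
```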

Search parameters

Parameter	Description	Range
`nprobe`	Number of units to query	[1, nlist]

For example:

In Python

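from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection
import numpy as np

# connect to Milvus (host and port are assumptions; adjust to your deployment)
connections.connect(alias="default", host="localhost", port="19530")

# example data (random, for illustration only)
d = 128                                              # vector dimension; must be divisible by m
xb = np.random.random((10000, d)).astype("float32")  # vectors to insert
xq = np.random.random((5, d)).astype("float32")      # query vectors

# create a collection
collection_name = "milvus_test3"
default_fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=d)
]
default_schema = CollectionSchema(fields=default_fields, description="test collection")
print("\nCreate collection...")
collection = Collection(name=collection_name, schema=default_schema)

# insert data (the id column is omitted because auto_id=True)
mr = collection.insert([xb.tolist()])
print(collection.num_entities)

# create index
collection.create_index(field_name="vector",
                        index_params={'index_type': 'IVF_PQ',
                                      'metric_type': 'L2',
                                      'params': {
                                        'nlist': 100,     # int. 1~65536
                                        'm': 16           # int. dim must be divisible by m
                                      }})

# load the collection into memory
collection.load()

# search
top_k = 10
results = collection.search(data=xq.tolist(), anns_field="vector",
                            param={"metric_type": "L2",
                                   "params": {"nprobe": 8}},  # int. 1~nlist
                            limit=top_k)

# show results: one entry per query vector
for result in results:
    print(result.ids)
    print(result.distances)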