Vector Search

Use Vector Search to build adaptive, user-focused applications with Generative AI.
Important

Vector Search is an Enterprise Edition feature. It is supported on ARM64 and x86-64; on x86-64, the CPU must also support the AVX2 instruction set.

Vector search compares vector embeddings instead of raw text. This lets your application retrieve documents that are semantically similar to a query item, even when the exact words do not match.

In Couchbase Lite for Dart, vector indexes are configured with VectorIndexConfiguration. Queries then use SQL++ functions such as APPROX_VECTOR_DISTANCE() to rank or filter matching documents. For the setup steps and end-to-end examples, see Working with Vector Search.
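For orientation, a minimal sketch of both steps, assuming a `Database` named `db`, a default collection whose documents carry a `vector` field, and a precomputed query embedding. The exact cbl-dart signatures (parameter names, sync vs. async variants) may differ from the release you use:

```dart
import 'package:cbl/cbl.dart';

Future<void> searchSimilar(Database db, List<double> queryVector) async {
  final collection = await db.defaultCollection;

  // Create a vector index over the `vector` field of each document.
  // 300 dimensions and 20 centroids are illustrative values only.
  await collection.createIndex(
    'colors_index',
    VectorIndexConfiguration('vector', dimensions: 300, centroids: 20),
  );

  // Rank documents by approximate distance to the query embedding.
  final query = await db.createQuery('''
    SELECT meta().id
    FROM _
    ORDER BY APPROX_VECTOR_DISTANCE(vector, \$target)
    LIMIT 10
  ''');
  await query.setParameters(
    Parameters()..setValue(queryVector, name: 'target'),
  );

  final resultSet = await query.execute();
  for (final result in await resultSet.allResults()) {
    print(result.string('id'));
  }
}
```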

Vector search retrieves semantically similar items by comparing vector embeddings in a multi-dimensional space. It is a core building block for generative AI and predictive AI workloads, because it allows applications to search by meaning rather than simple keyword matches.

Embeddings are arrays of numbers generated by machine learning models. They can represent text, images, audio, and other inputs in a form that makes similarity search efficient.

Once you have chosen the model that produces your embeddings, you can create vector indexes to store them efficiently and query the indexed data for similar results.

Vector search is useful in a range of mobile and edge workloads, including:

  • Semantic and similarity search for offline-first mobile and IoT applications.
  • Recommendation systems that compare users, items, or behavior by semantic similarity instead of exact metadata matches.
  • Retrieval-augmented generation (RAG), where relevant local data is retrieved and passed to an LLM as context.

Running vector search locally on-device or close to the edge also has practical benefits:

  • It supports cloud-to-edge similarity search patterns using the same Couchbase ecosystem.
  • Sensitive data and queries can stay on the device.
  • Local execution reduces network latency and variability.
  • Offloading searches from cloud-hosted models can reduce transfer and infrastructure costs.

Key Concepts of Vector Search in Couchbase Lite

When working with vector search in Couchbase Lite, a few concepts drive both indexing behavior and query quality.

About Vector Embeddings

Vector embeddings are numeric representations produced by a machine learning model. They capture the semantic or contextual relationships between items, so inputs the model considers similar end up closer together in vector space.

Couchbase Lite supports embeddings stored as:

  • Arrays of 32-bit floating-point numbers.
  • Base64 strings containing little-endian 32-bit floating-point values.
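As a sketch of the first storage form, a document can carry its embedding as a plain array field, which the vector index expression then points at. Here `embedding` stands in for output from your embedding model, and the `MutableDocument`/`saveDocument` calls follow the cbl-dart document API:

```dart
import 'package:cbl/cbl.dart';

// Illustrative: `embedding` would come from your embedding model and is
// stored as an array of numbers (indexed as 32-bit floats).
Future<void> saveWithEmbedding(
  Collection collection,
  String id,
  List<double> embedding,
) async {
  final doc = MutableDocument.withId(id)
    ..setValue('a denim couch', key: 'description')
    ..setValue(embedding, key: 'vector'); // the field the vector index uses
  await collection.saveDocument(doc);
}
```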

About Vector Indexes

Vector indexes store and manage vector embeddings so similarity queries can run efficiently. Before they can be used effectively, vector indexes need to be trained so Couchbase Lite can compute centroids and the parameters required for encoding.

Training starts automatically on the first vector search query once the number of available vectors satisfies the configured minimum training size. If there are not enough vectors yet, Couchbase Lite logs a message indicating how many are required.

Vector index training can affect query performance. If a query runs before the index is trained, Couchbase Lite falls back to a full scan over the stored vectors.
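The training thresholds can be tuned on the index configuration. The cascade-setter names below (`minTrainingSize`, `maxTrainingSize`) follow the Couchbase Lite vector-search API and are an assumption for cbl-dart:

```dart
import 'package:cbl/cbl.dart';

// Sketch: bound when training kicks in and how much data it may use.
final config = VectorIndexConfiguration(
  'vector',
  dimensions: 300,
  centroids: 20,
)
  ..minTrainingSize = 2000 // training starts once this many vectors exist
  ..maxTrainingSize = 5000; // cap on the number of vectors used for training
```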

About Lazy Vector Indexes

Important

Lazy indexing is not automatic. Your application is responsible for scheduling index updates explicitly.

Lazy vector indexes let you update vector indexes asynchronously instead of recomputing vectors as part of normal document writes. This is useful when:

  • Documents are created before an embedding model is available.
  • Your embedding model is remote and may be temporarily unavailable.
  • You want explicit control over when indexing work runs and how many documents are processed per batch.

Lazy indexing is separate from saving documents. Your application decides when to compute vectors, how many pending items to process, and which items to skip for a later retry.

  Feature                                       Regular Index   Lazy Index
  Updates when documents change                 Yes             No
  Updates when documents are deleted/purged     Yes             Yes
  Application controls when updates happen      No              Yes
  Application can skip an update                No              Yes
  Indexing runs asynchronously                  No              Yes
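A lazy-index update pass can be sketched as a loop over batches of pending items. Here `computeEmbedding` is a hypothetical callback into your embedding model, and the updater API names (`collection.index`, `beginUpdate`, `setVector`, `skipVector`, `finish`) follow Couchbase Lite's lazy-indexing design but may differ in cbl-dart:

```dart
import 'package:cbl/cbl.dart';

// Sketch of an application-scheduled lazy-index update pass.
Future<void> updateLazyIndex(
  Collection collection,
  Future<List<double>?> Function(Object? value) computeEmbedding,
) async {
  final index = await collection.index('colors_index');
  if (index == null) return;

  while (true) {
    // Fetch up to 50 items whose vectors still need to be computed.
    final updater = await index.beginUpdate(limit: 50);
    if (updater == null) break; // nothing pending

    for (var i = 0; i < updater.length; i++) {
      final vector = await computeEmbedding(updater.value(i));
      if (vector == null) {
        updater.skipVector(i); // leave this item for a later retry
      } else {
        updater.setVector(i, vector);
      }
    }
    await updater.finish(); // persist this batch into the index
  }
}
```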

About Vector Encoding

Vector encoding reduces the size of a vector index through compression. This can reduce disk usage and I/O cost during indexing and querying, but stronger compression can also reduce accuracy.

Couchbase Lite for Dart supports the following encodings through VectorEncoding:

  • VectorEncoding.none() for no compression and the highest quality.
  • VectorEncoding.scalarQuantizer(...) for scalar quantization using 4, 6, or 8 bits per dimension.
  • VectorEncoding.productQuantizer(...) for product quantization, which can preserve better quality than scalar quantization at higher complexity.
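The three options might look like the following. The constructor argument and enum member names here are assumptions about the cbl-dart API and may differ slightly between releases:

```dart
import 'package:cbl/cbl.dart';

// No compression: highest quality, largest index.
final uncompressed = VectorEncoding.none();

// Scalar quantization at 8 bits per dimension (4 and 6 are also supported).
final sq = VectorEncoding.scalarQuantizer(ScalarQuantizerType.eightBit);

// Product quantization: split each vector into subquantizers of `bits` each.
// The subquantizer count must evenly divide the vector's dimension count.
final pq = VectorEncoding.productQuantizer(subquantizers: 10, bits: 8);
```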

About Centroids

Centroids act as cluster centers inside a vector index. During training, embeddings are grouped into buckets around these centroids using k-means clustering.

More centroids can improve accuracy, but they also increase indexing cost. A useful rule of thumb is to start with roughly the square root of the number of documents.
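The square-root rule of thumb is straightforward to apply in code; `suggestedCentroids` is a hypothetical helper, not part of any API:

```dart
import 'dart:math';

// Rule of thumb: start with roughly sqrt(documentCount) centroids.
int suggestedCentroids(int documentCount) =>
    max(1, sqrt(documentCount).round());

// e.g. suggestedCentroids(10000) suggests 100 centroids
```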

About Probes

The number of probes controls how many centroid buckets Couchbase Lite examines when searching for nearby vectors. More probes can improve recall, but they also increase query work.

When setting a custom numProbes, Couchbase recommends using at least 8 probes, or about 0.5% of the number of centroids, whichever is greater.
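That guideline can be encoded directly when configuring the index. `suggestedProbes` is a hypothetical helper, and `numProbes` as a cascade setter is an assumption about the cbl-dart API:

```dart
import 'dart:math';
import 'package:cbl/cbl.dart';

// Guideline: at least 8 probes, or 0.5% of the centroid count.
int suggestedProbes(int centroids) => max(8, (centroids * 0.005).ceil());

final config = VectorIndexConfiguration(
  'vector',
  dimensions: 300,
  centroids: 2048,
)..numProbes = suggestedProbes(2048); // 0.5% of 2048 rounds up to 11
```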

About Dimensions

Dimensions are the width of each embedding vector: the number of numeric components in the vector. Higher-dimensional vectors often preserve more model information, but they also require more compute, memory, and storage.

Supported dimensions range from 2 to 4096.

About Distance Metrics

Distance metrics define how similarity is measured inside a vector index. Couchbase Lite for Dart exposes them as DistanceMetric.
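On other Couchbase Lite platforms the available metrics include squared Euclidean (the default), Euclidean, cosine, and dot-product distance; the enum member names below are an assumption for cbl-dart, as is the `metric` cascade setter:

```dart
import 'package:cbl/cbl.dart';

// Illustrative: cosine distance is a common choice for text embeddings,
// since it compares direction rather than magnitude.
final config = VectorIndexConfiguration(
  'vector',
  dimensions: 300,
  centroids: 20,
)..metric = DistanceMetric.cosine;
```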

About Hybrid Search

Hybrid vector search combines similarity search with traditional filtering or keyword search. A common pattern is to use vector search together with Full Text Search, so documents first match a keyword or metadata constraint and are then ranked by semantic similarity.

In hybrid search, vector similarity is applied only to the documents that have already passed the WHERE clause filters.
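A hybrid query can be sketched as plain SQL++: the WHERE clause narrows the candidate set, and only those documents are ranked by approximate distance. The field names here are illustrative, and `$target` would be bound to the query embedding at execution time:

```dart
// Sketch of a hybrid SQL++ query: filter first, then rank by similarity.
const hybridQuery = '''
  SELECT meta().id, name
  FROM _
  WHERE category = 'sofa'
  ORDER BY APPROX_VECTOR_DISTANCE(vector, \$target)
  LIMIT 10
''';
```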

For practical hybrid query examples, see Working with Vector Search.

See Also