You can write vector embeddings to a Microsoft Azure Cosmos DB SQL API target.
If your mapping includes a retrieval-augmented generation (RAG) ingestion pipeline that stores
vector embeddings in a vector database, you can use Microsoft Azure Cosmos DB SQL API Connector
to write the generated vector embeddings to a Microsoft Azure Cosmos DB SQL API target.
Before you write vector embeddings to the Microsoft Azure Cosmos DB SQL API target,
complete the following prerequisites:
Create a container and define a vector policy for the container in the Azure Cosmos DB account.
Based on the required embedding technique, configure the appropriate vector dimensions for the container in the Vector embedding section:
Word embedding: 300 dimensions
Sentence embedding: 768 dimensions
Custom LLM connection: 300, 768, 1536, or 3072 dimensions
For more information about how to choose a vector embedding technique, see Vector Embedding transformation in Transformations in the Data Integration help.
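
As an illustration of the prerequisites above, the following Python sketch builds the vector embedding policy and vector index definitions for a container. The property names follow the JSON shape that Azure Cosmos DB for NoSQL documents for container vector policies. The path /embedding, the float32 data type, the cosine distance function, and the quantizedFlat index type are assumptions chosen for the example, and the helper name build_vector_policy is hypothetical. The dimension values come from the list above.

```python
import json

# Dimensions supported per embedding technique, taken from the list above.
SUPPORTED_DIMENSIONS = {
    "word": {300},
    "sentence": {768},
    "custom_llm": {300, 768, 1536, 3072},
}


def build_vector_policy(technique, dimensions, path="/embedding"):
    """Return (vector_embedding_policy, indexing_policy) dicts for a container.

    The path, data type, distance function, and index type are example
    assumptions, not values required by the connector.
    """
    allowed = SUPPORTED_DIMENSIONS.get(technique)
    if allowed is None:
        raise ValueError(f"unknown embedding technique: {technique!r}")
    if dimensions not in allowed:
        raise ValueError(
            f"{dimensions} dimensions is not valid for {technique!r}; "
            f"expected one of {sorted(allowed)}"
        )
    vector_embedding_policy = {
        "vectorEmbeddings": [
            {
                "path": path,
                "dataType": "float32",
                "distanceFunction": "cosine",
                "dimensions": dimensions,
            }
        ]
    }
    indexing_policy = {
        "includedPaths": [{"path": "/*"}],
        # Keep the vector path out of the regular index and index it as a vector.
        "excludedPaths": [{"path": f"{path}/*"}],
        "vectorIndexes": [{"path": path, "type": "quantizedFlat"}],
    }
    return vector_embedding_policy, indexing_policy


if __name__ == "__main__":
    policy, indexing = build_vector_policy("sentence", 768)
    print(json.dumps(policy, indent=2))
    print(json.dumps(indexing, indent=2))
```

Definitions like these are typically supplied when the container is created, for example through the Azure portal or an Azure SDK. The dimensions value is the setting that must later agree with the Vector Embedding transformation in the mapping.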
Rules and guidelines to write vector embeddings
Consider the following rules and guidelines when you write vector embeddings to
Microsoft Azure Cosmos DB SQL API:
You can write vector embeddings only in a mapping in advanced mode.
You can write vector embeddings only to an existing Microsoft Azure Cosmos DB SQL API target. In the target properties, make sure to select the Generate the schema for the target object from the incoming fields to the Target transformation option.
In the Vector Embedding transformation, ensure that the vector dimensions match the vector dimensions configured in the container. The efficiency of vector search depends on how effectively the dimensions are configured within the container.
If the upstream transformation does not already include an identifier for the vector, add an Expression transformation to the mapping that uses the UUID_String function to create one.
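
The last two guidelines can be sketched outside the mapping as follows. This Python example is an analogy, not connector code: the standard uuid module stands in for the UUID_String function, and the field names id, text, and embedding and the helper prepare_item are assumptions made for illustration.

```python
import uuid


def prepare_item(embedding, container_dimensions, item_id=None, text=None):
    """Build a document for the target, enforcing the dimension guideline."""
    if len(embedding) != container_dimensions:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, "
            f"but the container expects {container_dimensions}"
        )
    return {
        # Reuse an upstream identifier if one exists; otherwise generate one.
        "id": item_id if item_id is not None else str(uuid.uuid4()),
        "text": text,
        "embedding": [float(x) for x in embedding],
    }


if __name__ == "__main__":
    item = prepare_item([0.1] * 768, container_dimensions=768, text="example chunk")
    print(item["id"], len(item["embedding"]))
```

Checking the vector length before the write surfaces a dimension mismatch early, and generating the identifier only when none is supplied mirrors the condition in the final guideline.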