Microsoft Azure Cosmos DB SQL API Connector

Guidelines to write vector embeddings

You can write vector embeddings to a Microsoft Azure Cosmos DB SQL API target.
If your mapping includes a RAG ingestion pipeline where you store vector embeddings in a vector database, you can use Microsoft Azure Cosmos DB SQL API Connector to write the generated vector embeddings to the Microsoft Azure Cosmos DB SQL API target.
Before you write vector embeddings to the Microsoft Azure Cosmos DB SQL API target, complete the following prerequisites:
  1. Create a container and define a vector policy for the container in the Azure Cosmos DB account.
  2. Based on the required embedding techniques, configure the appropriate vector dimensions for the container in the Vector embedding section:
    • Word embedding: 300 dimensions
    • Sentence embedding: 768 dimensions
    • Custom LLM connection: 300, 768, 1536, or 3072 dimensions
    The New Container window in the Azure Cosmos DB account displays the vector embedding section in which you can define vector policies for a container.
    For more information about how to choose a vector embedding technique, see Vector Embedding transformation in Transformations in the Data Integration help.
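
The dimension choices in the prerequisites can be sketched as a small helper that builds a container vector policy. This is a minimal illustration, not part of the connector: the function name `build_vector_policy`, the technique labels, the `/embedding` path, and the `cosine` distance function are assumptions chosen for the example. The dictionary follows the general shape of an Azure Cosmos DB container vector embedding policy.

```python
import json

# Dimensions listed in the prerequisites, keyed by embedding technique.
# The keys are illustrative labels, not connector settings.
TECHNIQUE_DIMENSIONS = {
    "word": {300},
    "sentence": {768},
    "custom_llm": {300, 768, 1536, 3072},
}


def build_vector_policy(technique, dimensions, path="/embedding",
                        distance_function="cosine"):
    """Return a container vector embedding policy as a plain dict.

    The path and distance function defaults are assumptions for this sketch.
    """
    allowed = TECHNIQUE_DIMENSIONS.get(technique)
    if allowed is None:
        raise ValueError(f"unknown embedding technique: {technique!r}")
    if dimensions not in allowed:
        raise ValueError(
            f"{dimensions} dimensions is not valid for {technique!r}; "
            f"expected one of {sorted(allowed)}"
        )
    return {
        "vectorEmbeddings": [
            {
                "path": path,
                "dataType": "float32",
                "distanceFunction": distance_function,
                "dimensions": dimensions,
            }
        ]
    }


if __name__ == "__main__":
    # Print the policy for a sentence embedding container.
    print(json.dumps(build_vector_policy("sentence", 768), indent=2))
```

Validating the dimension count before you create the container helps catch a mismatch early, before the mapping writes any vectors.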

Rules and guidelines to write vector embeddings

Consider the following rules and guidelines when you write vector embeddings to Microsoft Azure Cosmos DB SQL API:
  • You can write vector embeddings only in a mapping in advanced mode.
  • You can write vector embeddings only to an existing Microsoft Azure Cosmos DB SQL API target. In the target properties, make sure to select the Generate the schema for the target object from the incoming fields to the Target transformation option.
  • In the Vector Embedding transformation, ensure that the vector dimensions match the vector dimensions configured in the container. The efficiency of vector search depends on how effectively the dimensions are configured within the container.
  • If the upstream transformation does not already provide an identifier for the vector, add an Expression transformation to the mapping that uses the UUID_String function to create one.
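
The last two guidelines can be illustrated outside the mapping with a short Python sketch. It is an analogy only, not the connector's implementation: the function `prepare_document` and the field name `embedding` are hypothetical, and Python's `uuid.uuid4()` stands in for the UUID_String function of the Expression transformation.

```python
import uuid


def prepare_document(record, expected_dimensions, vector_field="embedding"):
    """Check the vector length and add an identifier if one is missing.

    `vector_field` is a hypothetical field name chosen for this sketch.
    """
    vector = record.get(vector_field)
    if vector is None:
        raise ValueError(f"record has no {vector_field!r} field")
    if len(vector) != expected_dimensions:
        raise ValueError(
            f"vector has {len(vector)} dimensions, "
            f"but the container expects {expected_dimensions}"
        )
    # Copy the record so the caller's data is not modified.
    document = dict(record)
    # Azure Cosmos DB items carry a string `id`; generate one only when
    # the upstream data does not already provide it.
    if not document.get("id"):
        document["id"] = str(uuid.uuid4())
    return document


if __name__ == "__main__":
    doc = prepare_document({"text": "hello", "embedding": [0.1, 0.2, 0.3]}, 3)
    print(doc["id"])
```

An identifier that already exists in the incoming record is kept unchanged, which mirrors the guideline to add the function only when the upstream transformation does not supply one.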
