Vector Embedding transformation

In advanced mode, you can use a Vector Embedding transformation to generate vector embeddings for input text, capturing the semantic meaning of the text in a vector format.
Before using a Vector Embedding transformation, use a Chunking transformation to split the text into chunks. Then, the Vector Embedding transformation can generate vector embeddings for each chunk of text using an embedding model like Word2Vec or BERT. For more information about the Chunking transformation, see Chunking transformation.
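The chunk-then-embed flow can be illustrated outside the product with a minimal Python sketch. Everything in it is an assumption made for illustration: `chunk_text`, `toy_embedding`, the fixed chunk size, and the hashed bag-of-words vector are hypothetical stand-ins. They do not reproduce how the Chunking or Vector Embedding transformations work, and the toy vector is not a real embedding model such as Word2Vec or BERT.

```python
import hashlib
import math


def chunk_text(text: str, max_words: int = 5) -> list[str]:
    """Split text into chunks of at most max_words words (hypothetical rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embedding(chunk: str, dims: int = 8) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vector = [0.0] * dims
    for word in chunk.lower().split():
        # A stable hash keeps each word in the same bucket across runs.
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vector[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


text = "Vector embeddings capture the semantic meaning of text in a numeric format"
chunks = chunk_text(text)                        # one entry per chunk
embeddings = [toy_embedding(c) for c in chunks]  # one vector per chunk
```

The point of the sketch is the shape of the data: each chunk produces exactly one fixed-length vector, which is why the text is split into chunks before embeddings are generated.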
To create an identifier for each vector, you can use either the UUID_STRING function in an Expression transformation or a Sequence Generator transformation:
  • If you use the UUID_STRING function in an Expression transformation, use the function without passing any arguments. The function returns a globally unique ID that can be stored in a string field with a precision of 100.
    UUID_STRING is an internal function that you can use only in advanced mode. Using it to create identifiers for other use cases might produce unexpected results.
  • If you use a Sequence Generator transformation, create a shared sequence to use across all mappings that load data to the same index in the vector database.
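The two identifier strategies can be compared with a short Python sketch. It uses Python's standard `uuid` and `itertools` modules as analogies only: it does not call UUID_STRING or a Sequence Generator transformation, and the chunk strings and variable names are hypothetical.

```python
import itertools
import uuid

chunks = ["first chunk of text", "second chunk of text", "third chunk of text"]

# Option 1: a random, globally unique ID per vector, generated without arguments.
# A standard UUID string is 36 characters, so it fits in a string field
# with a precision of 100.
uuid_ids = [str(uuid.uuid4()) for _ in chunks]

# Option 2: one shared counter reused by every load into the same index,
# so identifiers from separate loads never collide.
shared_sequence = itertools.count(start=1)
first_load_ids = [next(shared_sequence) for _ in chunks]
second_load_ids = [next(shared_sequence) for _ in chunks]
```

If each load started its own counter at 1 instead of sharing one, the second load would reuse the identifiers of the first and could overwrite existing vectors in the index, which is the reason for sharing the sequence across mappings.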
A Target transformation can write the vectors to a vector database.
The Vector Embedding transformation can't run in a serverless runtime environment, on an advanced cluster on Google Cloud, or on GPUs. If the transformation runs on a GPU-enabled cluster, GPUs are disabled and the transformation consumes CPUs.