In advanced mode, the Chunking transformation improves the effectiveness of natural
language processing (NLP) and retrieval-augmented generation (RAG) by breaking down text and
converting it to an efficient form. The transformation can split large pieces of text into
smaller segments, or chunks, and it can process text to make the data cleaner and
semantically more consistent for vector embedding.
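
As a rough illustration of what splitting and cleaning text can look like, the following Python sketch normalizes whitespace and then splits the text into word-based chunks that overlap slightly. It is not the transformation's actual implementation. The function names `clean_text` and `chunk_text`, the parameters `chunk_size` and `overlap`, and the word-based strategy are all assumptions made for this example.

```python
import re


def clean_text(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so that context spanning a
    chunk boundary is not lost. Both parameters are hypothetical and do
    not correspond to documented transformation properties.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = clean_text(text).split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = "Large   documents are split\n\ninto smaller, overlapping chunks."
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

Overlapping chunks are one common way to keep sentences that cross a chunk boundary retrievable. A sentence-based or token-based strategy might suit a given embedding model better.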
You can pass output from a Chunking transformation to a Vector Embedding transformation to create
vector embeddings for the text. A Chunking transformation increases the content's
relevance before the Target transformation writes the embeddings and metadata to a
vector database. For more information, see Vector Embedding transformation.
The Chunking transformation can't run in a serverless runtime environment on AWS,
and it can't run on GPUs. If the transformation runs on a GPU-enabled cluster,
GPUs are disabled and the transformation uses CPUs instead.