Multimodal Chat Completion with Google Vertex AI

Step 3. Configure and publish the process

Configure the Google Vertex AI LLM model version and grounding, which connects the model output to verifiable sources of information, and then publish the process.
  1. Open the Chat with Google Vertex AI process.
  2. Optionally, on the Assignments tab of the Set Flow Configuration step, enter values for the following fields:
    • In the Model_LLM field, enter the model ID of the LLM model. The default ID is gemini-1.5-flash-002.
    • To disable grounding, clear the check box corresponding to the Grounding field. The check box is selected by default. Grounding is the ability to connect the model output to verifiable sources of information.
    • In the Generation_Config field, enter the generation settings using the Expression Editor, as shown in the following sample code and in the sketch after this procedure:
      <generationConfig>
        <maxOutputTokens>8192</maxOutputTokens>
        <temperature>1</temperature>
        <topP>0.95</topP>
      </generationConfig>
      For the Generation_Config field, enter values for the following properties:
      • temperature: Controls the randomness of the model's output. A lower value close to 0 makes the output more deterministic, while a higher value close to 1 increases randomness and creativity. For example, if temperature is set to 0.5, the model balances deterministic and creative outputs.
      • topP: Determines the cumulative probability threshold for token selection. The model considers the smallest set of tokens whose cumulative probability meets or exceeds topP. For example, if topP is set to 0.1, the model samples only from the most probable tokens whose probabilities add up to 10%.
      • maxOutputTokens: Defines the maximum number of tokens that the model can generate in its response. The value can't exceed the model's context length. Most models have a context length of 2048 tokens.
  3. Save and publish the process.
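The Set Flow Configuration fields correspond to parameters of the Vertex AI generative models API. The following minimal sketch, which assumes the google-cloud-aiplatform Python SDK and uses placeholder project and location values, shows how the same model ID, grounding option, and generation configuration would look in a direct API call. It illustrates what the process configures for you; it is not part of the process itself.

# Minimal sketch of the equivalent direct Vertex AI call.
# Assumes the google-cloud-aiplatform SDK; the project ID,
# location, and prompt below are placeholders.
import vertexai
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    Tool,
    grounding,
)

vertexai.init(project="your-project-id", location="us-central1")

# Model_LLM field: the model ID, defaulting to gemini-1.5-flash-002.
model = GenerativeModel("gemini-1.5-flash-002")

# Grounding field: ground responses in Google Search results so that
# the output can be traced back to verifiable sources.
grounding_tool = Tool.from_google_search_retrieval(
    grounding.GoogleSearchRetrieval()
)

# Generation_Config field: the same properties as the XML sample.
config = GenerationConfig(
    max_output_tokens=8192,  # maxOutputTokens
    temperature=1.0,         # temperature
    top_p=0.95,              # topP
)

response = model.generate_content(
    "Summarize the key points of the uploaded document.",
    generation_config=config,
    tools=[grounding_tool],
)
print(response.text)

Note that the sample values (temperature of 1 and topP of 0.95) favor varied, creative responses; lower both values for more deterministic output.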
