Skip to main content

Overview

The Describe Image Block uses Vision Language Model (VLM) capabilities to analyze and describe images. By providing an image and an optional natural language prompt, you can instruct the AI to focus on specific aspects or provide general descriptions of the image content.
Describe Image Screenshot

Inputs

systemPrompt
string
The system prompt to send to the model. Optional. Used to provide high-level guidance to the AI model.
prompt
chat-message | chat-message[]
The prompt message or messages to send to the model. Only available if “Use Prompt Input” is enabled in settings.
image
image
required
The input image to be analyzed. Required. The image will be converted to a data URI before being sent to the model.

Outputs

output
string
The resulting description of the image. The content and focus of this description will depend on the input image and any provided prompts.

Editor Settings

model
string
default:"gpt-4o"
The AI vision model used to describe the image. Available models are dynamically populated based on the LLM provider configuration.
usePromptInput
boolean
default:false
When enabled, allows the prompt to be provided via an input port instead of being set in the settings.
prompt
string
The prompt to use when “Use Prompt Input” is disabled. This text will be sent to the model along with the image.
maxTokens
number
default:2048
The maximum number of tokens to generate in the response.
temperature
number
default:0
The sampling temperature to use. Lower values produce more focused and deterministic outputs, while higher values allow for more creativity in descriptions.
Available settings may vary depending on the selected LLM provider and model.

Example: Analyzing a Chart Image

  1. Add a Describe Image block to your flow.
  2. Connect your input image (e.g., a chart or graph) to the image input of the Describe Image block.
  3. Add a Text block with a prompt like “Describe the main trends and key data points in this chart” and connect it to the prompt input if using prompt input mode.
  4. Select your desired model in the Describe Image block settings.
  5. Run your flow. The block will output a detailed description of the chart, focusing on the trends and key data points.

Error Handling

  • If the input image is empty, invalid, or in an unsupported format, the block will return an error.
  • If the AI provider fails to analyze the image, the block will retry up to 3 times with exponential backoff (1-10 seconds between retries).
  • If the image is too large or complex for the model to process, the block may return an error or a partial description.
Always validate the output of the Describe Image block, especially when using it for critical applications or decision-making processes.

FAQ

The block can analyze a wide variety of images, including photographs, charts, graphs, diagrams, and more. The effectiveness may vary depending on the complexity of the image and the capabilities of the chosen AI model.
The level of detail in the descriptions can vary based on the complexity of the image, the specificity of the prompt (if provided), and the capabilities of the chosen AI model. You can often get more detailed or focused descriptions by using specific prompts.
While the block can generally describe the contents of an image, including objects and people, it typically doesn’t identify specific individuals. The level of object recognition depends on the AI model’s training and capabilities.

See Also

I