Image Analysis Agent

When building AI applications, you often need agents that can analyze and understand visual content. With Timbal, you can create image analysis agents that identify objects, describe scenes, and answer questions about an image passed alongside a text prompt.

Prerequisites

This example uses an OpenAI model. Make sure to add OPENAI_API_KEY to your .env file.

.env

OPENAI_API_KEY=your_api_key_here
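If the key is missing, the failure usually surfaces only when the model is first called. A minimal sketch of a fail-fast check, using only the standard library, is shown below. `require_env` is a hypothetical helper and not part of Timbal. It assumes the variable has already been loaded into the process environment by whatever reads your .env file.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; it is not part of Timbal.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Calling `require_env("OPENAI_API_KEY")` at startup, before the agent is used, should turn a missing key into an immediate, readable error.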

Creating an agent

Create a simple agent that analyzes images to identify objects, describe scenes, and answer questions about visual content.

from timbal.core import Agent

image_analysis_agent = Agent(
  name="image-analysis",
  description="Analyzes images to identify objects and describe scenes",
  system_prompt="""You can view an image and identify objects, describe scenes, and answer questions about the content.
  You can also determine species of animals and describe locations in the image.""",
  model="openai/gpt-4o"
)

Creating a function

This function returns a randomly chosen sample image URL for testing the agent's image analysis capabilities.

import random

def get_sample_image() -> str:
  """Get a sample image URL for testing image analysis."""
  sample_images = [
      "https://images.unsplash.com/photo-1441974231531-c6227db76b6e?w=800",  # Forest
      "https://images.unsplash.com/photo-1506905925346-21bda4d32df4?w=800",  # Mountains
      "https://images.unsplash.com/photo-1541961017774-22349e4a1262?w=800",  # Bird
      "https://images.unsplash.com/photo-1558618666-fcd25c85cd64?w=800",    # Cat
  ]

  return random.choice(sample_images)
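Because `random.choice` can return a different image on every run, results are hard to reproduce when debugging. One possible variation, sketched here as an assumption rather than as part of the original example, accepts an optional `random.Random` instance so a fixed seed always selects the same URL. The names `SAMPLE_IMAGES` and `pick_sample_image` are hypothetical.

```python
import random
from typing import Optional

# Same URLs as in get_sample_image above.
SAMPLE_IMAGES = [
    "https://images.unsplash.com/photo-1441974231531-c6227db76b6e?w=800",  # Forest
    "https://images.unsplash.com/photo-1506905925346-21bda4d32df4?w=800",  # Mountains
    "https://images.unsplash.com/photo-1541961017774-22349e4a1262?w=800",  # Bird
    "https://images.unsplash.com/photo-1558618666-fcd25c85cd64?w=800",  # Cat
]


def pick_sample_image(rng: Optional[random.Random] = None) -> str:
    """Pick a sample image URL; pass a seeded Random for repeatable choices."""
    if rng is None:
        # No generator supplied: fall back to the module-level random state.
        return random.choice(SAMPLE_IMAGES)
    return rng.choice(SAMPLE_IMAGES)
```

For example, `pick_sample_image(random.Random(0))` returns the same URL each time it is called with a freshly seeded generator, while `pick_sample_image()` keeps the original random behavior.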

Example usage

Use the agent directly by calling it with a prompt message that includes an image.

import asyncio

from timbal.types.file import File

async def main():
  image_url = get_sample_image()
  
  # Create a message with image and text for the agent
  prompt = [File.validate(image_url), "Analyze this image and identify the main objects or subjects. If there are animals, provide their common name and scientific name. Also describe the location or setting in one or two short sentences."]
  
  # Call the image analysis agent directly
  response = await image_analysis_agent(prompt=prompt).collect()
  
  # Extract the text response
  response_text = response.output.content[0].text
  print(response_text)

if __name__ == "__main__":
  asyncio.run(main())
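The example reads only `response.output.content[0].text`, which assumes the first content block is text. This page does not confirm whether the response can contain several blocks or non-text blocks, so the following is a defensive sketch under that assumption. `extract_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `.text` attribute used above; they are not Timbal types.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def extract_text(content: Iterable[Any]) -> str:
    """Join the text of every content block that exposes a string `.text`."""
    parts = []
    for block in content:
        text = getattr(block, "text", None)
        if isinstance(text, str):
            parts.append(text)
    return "\n".join(parts)


# Stand-in blocks for illustration only (not real Timbal objects).
fake_content = [
    SimpleNamespace(text="A forest path."),
    SimpleNamespace(url="https://example.com/image.png"),  # no .text, skipped
    SimpleNamespace(text="Tall trees on both sides."),
]
print(extract_text(fake_content))
```

In `main`, the same helper could be applied as `extract_text(response.output.content)` in place of indexing the first element, if the content is indeed an iterable of blocks.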
