
Google Gemini Image Generation Agent Example


This guide shows how to build a mailbox agent that exposes Google's Gemini 2.5 Flash Image model through the chat protocol. The agent generates images from text prompts and exchanges natural-language messages with ASI:One LLM and other agents.

Prerequisites

Before you begin, ensure you have:

Project Setup

1. Create Project Directory

mkdir gemini-image-agent
cd gemini-image-agent

2. Create Required Files

You'll need three files:

  • agent.py - Main agent implementation
  • .env - Environment variables and API keys
  • requirements.txt - Python dependencies

Implementation

Complete Agent Implementation

Create agent.py with the following code:

agent.py
from __future__ import annotations

import os
from datetime import datetime, timezone
from io import BytesIO
from typing import Any
from uuid import uuid4

from dotenv import load_dotenv
from uagents import Agent, Context, Protocol
from uagents_core.contrib.protocols.chat import (
    ChatAcknowledgement,
    ChatMessage,
    EndSessionContent,
    StartSessionContent,
    TextContent,
    MetadataContent,
    Resource,
    ResourceContent,
    chat_protocol_spec,
)
from uagents_core.storage import ExternalStorage

try:
    from google import genai
except ImportError as e:
    raise RuntimeError("google-genai is required. Please install 'google-genai'.") from e

# Load GOOGLE_API_KEY, AGENTVERSE_API_KEY, etc. from the .env file
load_dotenv()


def create_text_chat(text: str, end_session: bool = False) -> ChatMessage:
    """Helper to create text-based chat messages"""
    content: list[Any] = [TextContent(type="text", text=text)]
    if end_session:
        content.append(EndSessionContent(type="end-session"))
    return ChatMessage(timestamp=datetime.now(timezone.utc), msg_id=uuid4(), content=content)


# Agent configuration from environment (defaults match the .env example)
AGENT_NAME = os.getenv("AGENT_NAME", "gemini_image_agent")
AGENT_PORT = int(os.getenv("GEMINI_AGENT_PORT", "8042"))
AGENT_SEED = os.getenv("AGENT_SEED", "gemini image generation agent")

agent = Agent(name=AGENT_NAME, mailbox=True, port=AGENT_PORT, seed=AGENT_SEED)

chat_proto = Protocol(spec=chat_protocol_spec)

# Agentverse ExternalStorage config
AGENTVERSE_URL = os.getenv("AGENTVERSE_URL", "https://agentverse.ai")
STORAGE_URL = f"{AGENTVERSE_URL}/v1/storage"


def _get_gemini_client() -> "genai.Client":
    """Initialize Gemini client with API key from environment"""
    api_key = os.getenv("GOOGLE_API_KEY")
    if api_key:
        return genai.Client(api_key=api_key)
    return genai.Client()


def generate_image(prompt: str) -> tuple[bytes, str | None]:
    """Generate image using Gemini 2.5 Flash Image model"""
    client = _get_gemini_client()
    resp = client.models.generate_content(
        model="gemini-2.5-flash-image",
        contents=[prompt],
    )
    # Extract image data from response
    for cand in getattr(resp, "candidates", []) or []:
        content = getattr(cand, "content", None)
        if not content:
            continue
        for part in getattr(content, "parts", []) or []:
            inline = getattr(part, "inline_data", None)
            if inline is not None and getattr(inline, "data", None):
                data = inline.data
                mime_type = getattr(inline, "mime_type", None)
                return data, mime_type
    raise RuntimeError("No inline image returned by Gemini.")


def _convert_image_if_needed(image_bytes: bytes, mime_type: str | None) -> tuple[bytes, str]:
    """
    Convert non-PNG/JPEG images (e.g., WEBP) to PNG for broad compatibility.
    Keeps PNG/JPEG as-is. Returns (bytes, mime_type).
    """
    mt = (mime_type or "image/png").lower().split(";")[0]
    if mt in ("image/png", "image/jpeg", "image/jpg"):
        # Normalize jpg to jpeg
        return image_bytes, ("image/jpeg" if mt == "image/jpg" else mt)
    try:
        from PIL import Image
        with BytesIO(image_bytes) as bio:
            img = Image.open(bio)
            # Use PNG to preserve transparency if present
            out = BytesIO()
            img.save(out, format="PNG")
            return out.getvalue(), "image/png"
    except Exception:
        # Fallback to original
        return image_bytes, mt


@chat_proto.on_message(ChatMessage)
async def handle_message(ctx: Context, sender: str, msg: ChatMessage):
    """Handle incoming chat messages and generate images"""

    # Send acknowledgement
    await ctx.send(
        sender,
        ChatAcknowledgement(timestamp=datetime.now(timezone.utc), acknowledged_msg_id=msg.msg_id),
    )

    for item in msg.content:
        # Handle session start - advertise capabilities
        if isinstance(item, StartSessionContent):
            await ctx.send(
                sender,
                ChatMessage(
                    timestamp=datetime.now(timezone.utc),
                    msg_id=uuid4(),
                    content=[MetadataContent(type="metadata", metadata={"attachments": "true"})],
                ),
            )
            continue

        # Handle text prompts for image generation
        if isinstance(item, TextContent):
            prompt = (item.text or "").strip()
            if not prompt:
                await ctx.send(sender, create_text_chat(
                    "Please provide a prompt to generate an image.",
                    end_session=True
                ))
                return

            ctx.logger.info(f"Gemini image request from {sender}: {prompt}")

            # Send status update
            await ctx.send(sender, create_text_chat("Generating your image now…"))

            # Generate image with Gemini
            try:
                image_bytes, mime_type = generate_image(prompt)
            except Exception as e:
                ctx.logger.error(f"Gemini client error: {e}")
                await ctx.send(sender, create_text_chat(
                    "Sorry, the image API failed. Please try again later.",
                    end_session=True
                ))
                return

            if not image_bytes:
                await ctx.send(sender, create_text_chat(
                    "Model did not return an image.",
                    end_session=True
                ))
                return

            # Convert to compatible format if needed
            image_bytes, content_type = _convert_image_if_needed(image_bytes, mime_type)
            content_type = (content_type or "image/png").split(";")[0].lower()

            # Upload to Agentverse ExternalStorage
            try:
                api_token = os.getenv("AGENTVERSE_API_KEY")
                if api_token:
                    storage = ExternalStorage(api_token=api_token, storage_url=STORAGE_URL)
                else:
                    storage = ExternalStorage(identity=ctx.agent.identity, storage_url=STORAGE_URL)

                # Create asset in storage
                asset_id = storage.create_asset(
                    name=str(ctx.session),
                    content=image_bytes,
                    mime_type=content_type,
                )
                # Set permissions so sender can view it
                storage.set_permissions(asset_id=asset_id, agent_address=sender)

                # Send image as ResourceContent
                await ctx.send(
                    sender,
                    ChatMessage(
                        timestamp=datetime.now(timezone.utc),
                        msg_id=uuid4(),
                        content=[
                            ResourceContent(
                                type="resource",
                                resource_id=asset_id,
                                resource=Resource(
                                    uri=f"agent-storage://{STORAGE_URL}/{asset_id}",
                                    metadata={"mime_type": content_type, "role": "generated-image"},
                                ),
                            ),
                            EndSessionContent(type="end-session"),
                        ],
                    ),
                )
                return
            except Exception as e:
                ctx.logger.error(f"ExternalStorage upload failed: {e}")
                await ctx.send(sender, create_text_chat(
                    "Failed to publish image. Please try again.",
                    end_session=True
                ))
                return

        # Ignore other content types
        continue


@chat_proto.on_message(ChatAcknowledgement)
async def handle_acknowledgement(ctx: Context, sender: str, msg: ChatAcknowledgement):
    """Handle message acknowledgements"""
    ctx.logger.info(f"Got a chat acknowledgement from {sender}: {msg}")


# Include the chat protocol
agent.include(chat_proto, publish_manifest=True)


if __name__ == "__main__":
    agent.run()
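
In the listing above, generate_image is a synchronous call inside an async handler, so the agent cannot process other messages while Gemini is working. The following is a minimal sketch of one way around that, assuming Python 3.9+ (for asyncio.to_thread). slow_generate is a hypothetical stand-in for generate_image so the example runs without an API key, and the heartbeat task exists only to show that the event loop keeps running.

```python
import asyncio
import time


def slow_generate(prompt: str) -> tuple[bytes, str]:
    """Hypothetical stand-in for generate_image(): blocks like a network call."""
    time.sleep(0.2)
    return f"image for {prompt}".encode(), "image/png"


async def generate_without_blocking(prompt: str) -> tuple[bytes, str]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(slow_generate, prompt)


async def demo() -> tuple[tuple[bytes, str], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the loop gets control while the image is generated.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.02)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    result = await generate_without_blocking("a sunset")
    beat.cancel()
    return result, ticks


result, ticks = asyncio.run(demo())
print(result[1], "generated while the loop ticked", ticks, "times")
```

Applied to the agent, the call in handle_message would become `image_bytes, mime_type = await asyncio.to_thread(generate_image, prompt)`. The same wrapping could be used for the storage upload if it turns out to block for long.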

Understanding ResourceContent

Unlike the text-based Gemini Pro agent that uses TextContent, this image generation agent uses ResourceContent to send generated images. Here's the key difference:

TextContent (Used for Text Messages)

# Sending text responses
TextContent(type="text", text="Hello, world!")

Use cases:

  • Plain text responses
  • Conversation messages
  • Status updates

ResourceContent (Used for Images & Files)

# Sending image responses
ResourceContent(
    type="resource",
    resource_id=asset_id,  # ID from ExternalStorage
    resource=Resource(
        uri=f"agent-storage://{STORAGE_URL}/{asset_id}",
        metadata={"mime_type": "image/png", "role": "generated-image"},
    ),
)

Use cases:

  • Images (generated or uploaded)
  • Files and documents
  • Audio/video content
  • Any binary data

Why ResourceContent for Images?

  1. Efficient: Images are stored once and referenced by ID, not duplicated in every message
  2. Secure: Permissions control who can access the resource
  3. Compatible: Works seamlessly with ASI:One and other agents
  4. Metadata: Can include MIME type, role, and other information

ExternalStorage for Images

The agent uses Agentverse's ExternalStorage service to store and share images:

# Initialize storage
storage = ExternalStorage(api_token=api_token, storage_url=STORAGE_URL)

# Upload image
asset_id = storage.create_asset(
    name=str(ctx.session),
    content=image_bytes,
    mime_type="image/png",
)

# Set permissions (important!)
storage.set_permissions(asset_id=asset_id, agent_address=sender)

Key points:

  • Images are uploaded to Agentverse storage
  • Each image gets a unique asset_id
  • Permissions must be set so recipients can view the image
  • The URI follows the format: agent-storage://{STORAGE_URL}/{asset_id}
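
To make the URI format concrete, here is a small sketch that composes the URI exactly as the agent does and splits it back into its parts. build_resource_uri and split_resource_uri are hypothetical helpers written for this illustration, not part of uagents-core, and "asset-123" is a made-up asset ID.

```python
STORAGE_URL = "https://agentverse.ai/v1/storage"  # default used in agent.py
PREFIX = "agent-storage://"


def build_resource_uri(storage_url: str, asset_id: str) -> str:
    """Compose the URI the agent places in Resource(uri=...)."""
    return f"{PREFIX}{storage_url.rstrip('/')}/{asset_id}"


def split_resource_uri(uri: str) -> tuple[str, str]:
    """Return (storage_url, asset_id) for a URI built by build_resource_uri."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an agent-storage URI: {uri!r}")
    # The asset ID is everything after the last slash.
    storage_url, _, asset_id = uri[len(PREFIX):].rpartition("/")
    if not storage_url or not asset_id:
        raise ValueError(f"malformed agent-storage URI: {uri!r}")
    return storage_url, asset_id


uri = build_resource_uri(STORAGE_URL, "asset-123")
print(uri)  # agent-storage://https://agentverse.ai/v1/storage/asset-123
```

Note that the full storage URL, including its own https:// scheme, is embedded after the agent-storage:// prefix.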

Configuration Files

Requirements File

Create requirements.txt:

requirements.txt
uagents==0.22.10
uagents-core==0.3.11
google-genai>=0.3.0
python-dotenv>=1.0.1
Pillow>=10.0.0

Package purposes:

  • uagents - Core agent framework
  • uagents-core - Chat protocol and storage utilities
  • google-genai - Google Gemini API client
  • python-dotenv - Load environment variables from .env file
  • Pillow - Image format conversion (WEBP → PNG)
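
The agent assumes image/png when Gemini reports no MIME type. As an optional refinement, the sketch below guesses the type from the leading bytes of the image instead. sniff_image_mime is a hypothetical helper for this illustration (it is not provided by Pillow, uagents, or google-genai) and it only recognizes the four formats listed.

```python
from typing import Optional


def sniff_image_mime(data: bytes) -> Optional[str]:
    """Guess an image MIME type from well-known file signatures."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    # WEBP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "image/gif"
    return None


print(sniff_image_mime(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))  # image/png
print(sniff_image_mime(b"plain text"))  # None
```

One possible use is `mime_type = mime_type or sniff_image_mime(image_bytes)` before calling _convert_image_if_needed, so that a WEBP image without a declared type still gets converted.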

Environment Variables

Create .env file with your API keys:

.env
# Google Gemini API Key
# Obtain from: https://aistudio.google.com/apikey
GOOGLE_API_KEY=your_google_api_key_here

# Agentverse API Key (Required for ExternalStorage)
# Obtain from: https://agentverse.ai/profile/api-keys
AGENTVERSE_API_KEY=your_agentverse_api_key_here

# Agentverse URL (optional, defaults to https://agentverse.ai)
AGENTVERSE_URL=https://agentverse.ai

# Agent Configuration (optional)
GEMINI_AGENT_PORT=8042
AGENT_NAME=gemini_image_agent
AGENT_SEED=gemini image generation agent

Important: Never commit your .env file to version control! Add it to .gitignore:

echo ".env" >> .gitignore

Getting Your API Keys

1. Google AI API Key

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the key and add it to your .env file as GOOGLE_API_KEY

2. Agentverse API Key

Why is this needed? Mailbox agents need an API key to authenticate with Agentverse ExternalStorage for uploading and sharing images.

To get your Agentverse API key, follow the detailed guide here: How to get Agentverse API Key

Once you have your API key, add it to your .env file as AGENTVERSE_API_KEY.

Note: Hosted agents on Agentverse can use their agent identity for storage instead of an API key. Mailbox agents running locally require an explicit API key.

Installation & Running

1. Install Dependencies

pip install -r requirements.txt

2. Configure Environment

Make sure your .env file is set up with both API keys:

# Verify your .env file
cat .env
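
Because cat prints the secrets to the terminal, a check that reports only which keys are missing may be preferable. This is a minimal sketch under stated assumptions: missing_keys is a hypothetical helper, and treating values ending in "_here" as unset is a convention taken from the placeholder values in the .env example above.

```python
from typing import Mapping

REQUIRED_KEYS = ("GOOGLE_API_KEY", "AGENTVERSE_API_KEY")
PLACEHOLDER_SUFFIX = "_here"  # e.g. your_google_api_key_here in the template


def missing_keys(env: Mapping[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are unset, blank, or still the template value."""
    missing = []
    for key in required:
        value = (env.get(key) or "").strip()
        if not value or value.endswith(PLACEHOLDER_SUFFIX):
            missing.append(key)
    return missing


sample = {
    "GOOGLE_API_KEY": "your_google_api_key_here",  # template value left in place
    "AGENTVERSE_API_KEY": "abc123",
}
print(missing_keys(sample))  # ['GOOGLE_API_KEY']
```

With python-dotenv installed, `missing_keys(dotenv_values(".env"))` (after `from dotenv import dotenv_values`) checks the file directly without echoing any key.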

3. Run the Agent

python agent.py

You should see output like:

INFO: [gemini_image_agent]: Starting agent with address: agent1qdzf55y0at40rezdtlhgzf4q67ym6xz0lh0c82j798q5eescxh2f6nh2vj5
INFO: [gemini_image_agent]: Agent inspector available at https://agentverse.ai/inspect/?uri=http%3A//127.0.0.1%3A8042&address=agent1qdzf55y0at40rezdtlhgzf4q67ym6xz0lh0c82j798q5eescxh2f6nh2vj5
INFO: [gemini_image_agent]: Starting server on http://0.0.0.0:8042 (Press CTRL+C to quit)
INFO: [gemini_image_agent]: Starting mailbox client for https://agentverse.ai
INFO: [gemini_image_agent]: Mailbox access token acquired
INFO: [gemini_image_agent]: Manifest published successfully: AgentChatProtocol
INFO: [uagents.registration]: Registration on Almanac API successful
INFO: [uagents.registration]: Almanac contract registration is up to date!

Note: The agent connects to the mailbox service automatically when mailbox=True is set. This allows your local agent to receive messages from other agents and ASI:One. For more details on mailbox connectivity, see Mailbox Agents.

4. Register Your Agent on Agentverse

To make your agent discoverable by ASI:One:

  1. Keep your agent running locally
  2. Visit the Agent inspector link shown in the output above
  3. Connect your agent on Agentverse

[Screenshot: Local Agent Registration]

Adding a README to your Agent

A well-written README helps make your agent easily discoverable by ASI:One LLM and other agents. Once your agent is registered on Agentverse, follow these steps to add or edit your agent's README:

  1. Navigate to your agent's profile page on Agentverse

  2. Click the Edit button next to Readme.md

  3. Add a comprehensive description:

    In the Readme.md editor, add a detailed description of your agent. Include:

    • Purpose: What your agent does (image generation from text)
    • Functionalities: Key capabilities (image types, styles supported)
    • Usage guidelines: How to interact with the agent (prompt best practices)
    • Technical specifications: Model used (Gemini 2.5 Flash Image), protocols, storage
    • Examples: Sample prompts and use cases

    Example sections to include:

    • Image types: photorealistic, artistic, logos, scenes, abstract art
    • Best practices: Be specific, include style information, specify technical details
    • Example prompts: "A sunset over mountain peaks", "A minimalist tech logo", etc.
    • Use cases: Marketing, creative projects, social media content
    • Response time: 5-15 seconds typical
    • Limitations: One image per request, follows content policy

    A good description helps ASI:One LLM understand when to select your agent for image generation tasks. Please refer to the Importance of Good Readme section for more details.

  4. Make sure that your agent is enabled with Agent Chat Protocol v0.3.0

    Verify that the chat protocol is properly configured and the agent is using the latest protocol version.

Testing Your Agent

1. Test via Agentverse Chat Interface

Once your agent is running and registered:

  1. Go to your agent's page on Agentverse
  2. Click "Chat with Agent" button
  3. Send a prompt like: "Generate an image of a sunset over mountains"
  4. Wait for the image to generate and display

[Screenshot: Agentverse Chat Test]

2. Query from ASI:One LLM

  1. Login to ASI:One LLM:

    Visit ASI:One and log in with your Google Account or ASI:One Wallet

  2. Toggle the Agents switch:

    Enable the "Agents" toggle to allow ASI:One to connect with Agentverse agents

  3. Use your agent's handle:

    Type @gemini-image-agent (or your agent's handle) followed by your prompt:

    @gemini-image-agent Create a photorealistic image of a golden retriever puppy playing in a garden

  4. View the generated image:

    Your agent will process the request, generate the image with Gemini, and send it back through ASI:One

[Screenshot: ASI Image Test]

Note: Make sure your agent is running locally and connected to the mailbox. If ASI:One can't reach your agent, verify that:

  • Your agent is running (python agent.py)
  • It's registered on Agentverse under "Local Agents"
  • Your firewall isn't blocking the agent port
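
As a quick first check, the sketch below tests whether anything is accepting TCP connections on the agent port. It assumes the default port 8042 from this guide, and port_is_open is a hypothetical helper. It only confirms that the agent is listening locally; it does not inspect firewall rules or the mailbox connection.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("agent port reachable:", port_is_open("127.0.0.1", 8042))
```

If this prints False while agent.py is running, check that GEMINI_AGENT_PORT matches the port you are testing.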

Add Conversation History

Store previous requests to maintain context:

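@agent.on_event("startup")
async def startup(ctx: Context):
    # Initialize history only once so restarts do not wipe stored prompts
    if ctx.storage.get("image_history") is None:
        ctx.storage.set("image_history", {})

# In handle_message, after `prompt` has been extracted, track history per sender:
history = ctx.storage.get("image_history") or {}
history[sender] = history.get(sender, []) + [prompt]
ctx.storage.set("image_history", history)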
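
Stored this way, the history grows without bound. A minimal sketch of a capped variant follows, using a plain dict in place of ctx.storage so it runs on its own. record_prompt and the limit of 20 are assumptions made for illustration, and "agent1xyz" is a made-up sender address.

```python
MAX_PROMPTS_PER_SENDER = 20  # assumed cap; tune to your needs


def record_prompt(
    history: dict[str, list[str]],
    sender: str,
    prompt: str,
    limit: int = MAX_PROMPTS_PER_SENDER,
) -> dict[str, list[str]]:
    """Return a copy of history with prompt appended, keeping the newest `limit` entries."""
    updated = dict(history)
    updated[sender] = (history.get(sender, []) + [prompt])[-limit:]
    return updated


history: dict[str, list[str]] = {}
for i in range(5):
    history = record_prompt(history, "agent1xyz", f"prompt {i}", limit=3)
print(history)  # {'agent1xyz': ['prompt 2', 'prompt 3', 'prompt 4']}
```

In the handler this could be wired up as `ctx.storage.set("image_history", record_prompt(ctx.storage.get("image_history") or {}, sender, prompt))`.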

Additional Resources

Complete Code Repository

For the complete working example with all files, visit:


Happy image generating! 🎨

If you have questions or run into issues, reach out to the Fetch.ai community on Discord or Telegram.