Google Gemini Image Generation Agent Example

This guide shows how to build a mailbox agent that uses Google's Gemini 2.5 Flash Image model through the chat protocol. The agent generates images from text prompts and exchanges natural-language messages with the ASI:One LLM and other agents.

Prerequisites

Before you begin, ensure you have:

Python 3.10 or higher installed
A Google AI API key for accessing Gemini models
An Agentverse API key for external storage
Understanding of the Chat Protocol concepts
Basic familiarity with Mailbox Agents

Project Setup

1. Create Project Directory

mkdir gemini-image-agent
cd gemini-image-agent

2. Create Required Files

You'll need three files:

agent.py - Main agent implementation
.env - Environment variables and API keys
requirements.txt - Python dependencies

Implementation

Complete Agent Implementation

Create agent.py with the following code:

agent.py
from __future__ import annotations

from uuid import uuid4
from datetime import datetime, timezone
from typing import Any
import os

from uagents import Agent, Context, Protocol
from uagents_core.contrib.protocols.chat import (
    ChatAcknowledgement,
    ChatMessage,
    EndSessionContent,
    StartSessionContent,
    TextContent,
    MetadataContent,
    Resource,
    ResourceContent,
    chat_protocol_spec,
)
from io import BytesIO

try:
    from google import genai
except ImportError as e:
    raise RuntimeError("google-genai is required. Please install 'google-genai'.") from e

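from dotenv import load_dotenv
from uagents_core.storage import ExternalStorage

# Load variables from .env into the process environment before any os.getenv() call
load_dotenv()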


def create_text_chat(text: str, end_session: bool = False) -> ChatMessage:
    """Helper to create text-based chat messages"""
    content: list[Any] = [TextContent(type="text", text=text)]
    if end_session:
        content.append(EndSessionContent(type="end-session"))
    return ChatMessage(timestamp=datetime.now(timezone.utc), msg_id=uuid4(), content=content)


# Agent configuration from environment, with defaults matching the sample .env
AGENT_NAME = os.getenv("AGENT_NAME", "gemini_image_agent")
AGENT_PORT = int(os.getenv("GEMINI_AGENT_PORT", "8042"))
AGENT_SEED = os.getenv("AGENT_SEED", "gemini image generation agent")

agent = Agent(name=AGENT_NAME, mailbox=True, port=AGENT_PORT, seed=AGENT_SEED)

chat_proto = Protocol(spec=chat_protocol_spec)

# Agentverse ExternalStorage config
AGENTVERSE_URL = os.getenv("AGENTVERSE_URL", "https://agentverse.ai")
STORAGE_URL = f"{AGENTVERSE_URL}/v1/storage"


def _get_gemini_client() -> "genai.Client":
    """Initialize Gemini client with API key from environment"""
    api_key = os.getenv("GOOGLE_API_KEY")
    if api_key:
        return genai.Client(api_key=api_key)
    return genai.Client()


def generate_image(prompt: str) -> tuple[bytes, str | None]:
    """Generate image using Gemini 2.5 Flash Image model"""
    client = _get_gemini_client()
    resp = client.models.generate_content(
        model="gemini-2.5-flash-image",
        contents=[prompt],
    )
    # Extract image data from response
    for cand in getattr(resp, "candidates", []) or []:
        content = getattr(cand, "content", None)
        if not content:
            continue
        for part in getattr(content, "parts", []) or []:
            inline = getattr(part, "inline_data", None)
            if inline is not None and getattr(inline, "data", None):
                data = inline.data
                mime_type = getattr(inline, "mime_type", None)
                return data, mime_type
    raise RuntimeError("No inline image returned by Gemini.")


def _convert_image_if_needed(image_bytes: bytes, mime_type: str | None) -> tuple[bytes, str]:
    """
    Convert non-PNG/JPEG images (e.g., WEBP) to PNG for broad compatibility.
    Keeps PNG/JPEG as-is. Returns (bytes, mime_type).
    """
    mt = (mime_type or "image/png").lower().split(";")[0]
    if mt in ("image/png", "image/jpeg", "image/jpg"):
        # Normalize jpg to jpeg
        return image_bytes, ("image/jpeg" if mt == "image/jpg" else mt)
    try:
        from PIL import Image
        with BytesIO(image_bytes) as bio:
            img = Image.open(bio)
            # Use PNG to preserve transparency if present
            out = BytesIO()
            img.save(out, format="PNG")
            return out.getvalue(), "image/png"
    except Exception:
        # Fallback to original
        return image_bytes, mt


@chat_proto.on_message(ChatMessage)
async def handle_message(ctx: Context, sender: str, msg: ChatMessage):
    """Handle incoming chat messages and generate images"""
    
    # Send acknowledgement
    await ctx.send(
        sender,
        ChatAcknowledgement(timestamp=datetime.now(timezone.utc), acknowledged_msg_id=msg.msg_id),
    )

    for item in msg.content:
        # Handle session start - advertise capabilities
        if isinstance(item, StartSessionContent):
            await ctx.send(
                sender,
                ChatMessage(
                    timestamp=datetime.now(timezone.utc),
                    msg_id=uuid4(),
                    content=[MetadataContent(type="metadata", metadata={"attachments": "true"})],
                ),
            )
            continue

        # Handle text prompts for image generation
        if isinstance(item, TextContent):
            prompt = (item.text or "").strip()
            if not prompt:
                await ctx.send(sender, create_text_chat(
                    "Please provide a prompt to generate an image.", 
                    end_session=True
                ))
                return

            ctx.logger.info(f"Gemini image request from {sender}: {prompt}")
            
            # Send status update
            await ctx.send(sender, create_text_chat("Generating your image now…"))

            # Generate image with Gemini
            try:
                image_bytes, mime_type = generate_image(prompt)
            except Exception as e:
                ctx.logger.error(f"Gemini client error: {e}")
                await ctx.send(sender, create_text_chat(
                    "Sorry, the image API failed. Please try again later.", 
                    end_session=True
                ))
                return

            if not image_bytes:
                await ctx.send(sender, create_text_chat(
                    "Model did not return an image.", 
                    end_session=True
                ))
                return

            # Convert to compatible format if needed
            image_bytes, content_type = _convert_image_if_needed(image_bytes, mime_type)
            content_type = (content_type or "image/png").split(";")[0].lower()

            # Upload to Agentverse ExternalStorage
            try:
                api_token = os.getenv("AGENTVERSE_API_KEY")
                if api_token:
                    storage = ExternalStorage(api_token=api_token, storage_url=STORAGE_URL)
                else:
                    storage = ExternalStorage(identity=ctx.agent.identity, storage_url=STORAGE_URL)

                # Create asset in storage
                asset_id = storage.create_asset(
                    name=str(ctx.session),
                    content=image_bytes,
                    mime_type=content_type,
                )
                # Set permissions so sender can view it
                storage.set_permissions(asset_id=asset_id, agent_address=sender)

                # Send image as ResourceContent
                await ctx.send(
                    sender,
                    ChatMessage(
                        timestamp=datetime.now(timezone.utc),
                        msg_id=uuid4(),
                        content=[
                            ResourceContent(
                                type="resource",
                                resource_id=asset_id,
                                resource=Resource(
                                    uri=f"agent-storage://{STORAGE_URL}/{asset_id}",
                                    metadata={"mime_type": content_type, "role": "generated-image"},
                                ),
                            ),
                            EndSessionContent(type="end-session"),
                        ],
                    ),
                )
                return
            except Exception as e:
                ctx.logger.error(f"ExternalStorage upload failed: {e}")
                await ctx.send(sender, create_text_chat(
                    "Failed to publish image. Please try again.", 
                    end_session=True
                ))
                return

        # Ignore other content types
        continue


@chat_proto.on_message(ChatAcknowledgement)
async def handle_acknowledgement(ctx: Context, sender: str, msg: ChatAcknowledgement):
    """Handle message acknowledgements"""
    ctx.logger.info(f"Got a chat acknowledgement from {sender}: {msg}")


# Include the chat protocol
agent.include(chat_proto, publish_manifest=True)


if __name__ == "__main__":
    agent.run()
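
One design point about the handler above: generate_image is a regular synchronous function called from an async handler, so the agent's event loop waits while Gemini responds. The following is a minimal sketch of one way to offload such a call with asyncio.to_thread from the standard library. Here slow_generate is a hypothetical stand-in for generate_image and is not part of the agent code; the ticker task only exists to show that other coroutines keep running in the meantime.

```python
import asyncio
import time


def slow_generate(prompt: str) -> bytes:
    """Hypothetical stand-in for a blocking call such as generate_image()."""
    time.sleep(0.05)  # simulate network latency
    return f"image-for:{prompt}".encode()


async def generate_without_blocking(prompt: str) -> bytes:
    # Run the blocking function in a worker thread so the event loop stays free
    return await asyncio.to_thread(slow_generate, prompt)


async def demo() -> tuple:
    ticks = 0

    async def ticker() -> None:
        # Counts how often the event loop gets to run while we wait
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.005)

    task = asyncio.create_task(ticker())
    result = await generate_without_blocking("sunset")
    task.cancel()
    return result, ticks
```

In the handler, the same idea would look like `await asyncio.to_thread(generate_image, prompt)`, assuming the rest of the handler stays unchanged.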

Understanding ResourceContent

Unlike the text-based Gemini Pro agent that uses TextContent, this image generation agent uses ResourceContent to send generated images. Here's the key difference:

TextContent (Used for Text Messages)

# Sending text responses
TextContent(type="text", text="Hello, world!")

Use cases:

Plain text responses
Conversation messages
Status updates

ResourceContent (Used for Images & Files)

# Sending image responses
ResourceContent(
    type="resource",
    resource_id=asset_id,  # ID from ExternalStorage
    resource=Resource(
        uri=f"agent-storage://{STORAGE_URL}/{asset_id}",
        metadata={"mime_type": "image/png", "role": "generated-image"}
    )
)

Use cases:

Images (generated or uploaded)
Files and documents
Audio/video content
Any binary data

Why ResourceContent for Images?

Efficient: Images are stored once and referenced by ID, not duplicated in every message
Secure: Permissions control who can access the resource
Compatible: Works seamlessly with ASI:One and other agents
Metadata: Can include MIME type, role, and other information
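
To illustrate the efficiency point, here is a small sketch that builds the same fields the agent passes to ResourceContent as a plain dictionary and compares their serialized size with the image itself. build_resource_fields is a hypothetical helper, "abc123" is a placeholder asset ID, and the zero-filled byte string merely stands in for a generated image.

```python
import json


def build_resource_fields(storage_url: str, asset_id: str, mime_type: str) -> dict:
    """Hypothetical helper: mirrors the fields used for ResourceContent in agent.py."""
    return {
        "resource_id": asset_id,
        "uri": f"agent-storage://{storage_url}/{asset_id}",
        "metadata": {"mime_type": mime_type, "role": "generated-image"},
    }


# Stand-in for a 1 MB generated image (contents are irrelevant here)
image_bytes = bytes(1_000_000)

fields = build_resource_fields("https://agentverse.ai/v1/storage", "abc123", "image/png")

# The message only carries the reference, so its size does not grow with the image
payload_size = len(json.dumps(fields))
```

The reference stays a few hundred bytes at most regardless of how large the stored image is.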

ExternalStorage for Images

The agent uses Agentverse's ExternalStorage service to store and share images:

# Initialize storage
storage = ExternalStorage(api_token=api_token, storage_url=STORAGE_URL)

# Upload image
asset_id = storage.create_asset(
    name=str(ctx.session),
    content=image_bytes,
    mime_type="image/png"
)

# Set permissions (important!)
storage.set_permissions(asset_id=asset_id, agent_address=sender)

Key points:

Images are uploaded to Agentverse storage
Each image gets a unique asset_id
Permissions must be set so recipients can view the image
The URI follows the format: agent-storage://{STORAGE_URL}/{asset_id}
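
As a rough illustration of the URI format, the sketch below splits such a URI back into its storage URL and asset ID. parse_storage_uri is a hypothetical helper, not part of uagents-core, and it assumes the asset ID itself contains no "/" character.

```python
SCHEME = "agent-storage://"


def parse_storage_uri(uri: str) -> tuple:
    """Split 'agent-storage://{STORAGE_URL}/{asset_id}' into (storage_url, asset_id).

    Hypothetical helper; assumes the asset ID contains no '/'.
    """
    if not uri.startswith(SCHEME):
        raise ValueError(f"not an agent-storage URI: {uri!r}")
    rest = uri[len(SCHEME):]
    # The asset ID is everything after the last slash
    storage_url, sep, asset_id = rest.rpartition("/")
    if not sep or not storage_url or not asset_id:
        raise ValueError(f"malformed agent-storage URI: {uri!r}")
    return storage_url, asset_id
```

Note that STORAGE_URL already contains its own https:// prefix, so the full URI carries two scheme-like parts; the helper strips only the leading agent-storage:// one.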

Configuration Files

Requirements File

Create requirements.txt:

requirements.txt
uagents==0.22.10
uagents-core==0.3.11
google-genai>=0.3.0
python-dotenv>=1.0.1
Pillow>=10.0.0

Package purposes:

uagents - Core agent framework
uagents-core - Chat protocol and storage utilities
google-genai - Google Gemini API client
python-dotenv - Load environment variables from .env file
Pillow - Image format conversion (WEBP → PNG)
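
After installing, it can be handy to confirm which versions actually ended up in the environment. This is a small sketch using only the standard library; installed_versions is a hypothetical helper name, and the package list simply repeats the names from requirements.txt.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, List, Optional


def installed_versions(packages: List[str]) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    names = ["uagents", "uagents-core", "google-genai", "python-dotenv", "Pillow"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```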

Environment Variables

Create .env file with your API keys:

.env
# Google Gemini API Key
# Obtain from: https://aistudio.google.com/apikey
GOOGLE_API_KEY=your_google_api_key_here

# Agentverse API Key (Required for ExternalStorage)
# Obtain from: https://agentverse.ai/profile/api-keys
AGENTVERSE_API_KEY=your_agentverse_api_key_here

# Agentverse URL (optional, defaults to https://agentverse.ai)
AGENTVERSE_URL=https://agentverse.ai

# Agent Configuration (optional)
GEMINI_AGENT_PORT=8042
AGENT_NAME=gemini_image_agent
AGENT_SEED=gemini image generation agent

Important: Never commit your .env file to version control! Add it to .gitignore:

echo ".env" >> .gitignore
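
The variables above can also be validated once at startup instead of being read ad hoc. The following is a sketch under stated assumptions: Settings and load_settings are hypothetical names, and failing fast on a missing GOOGLE_API_KEY is stricter than agent.py, which falls back to the client's default credential lookup when the variable is unset.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    agentverse_api_key: Optional[str]
    agentverse_url: str
    port: int

    @property
    def storage_url(self) -> str:
        # Same path that agent.py appends to AGENTVERSE_URL
        return f"{self.agentverse_url}/v1/storage"


def load_settings(env: Mapping[str, str]) -> Settings:
    """Read and validate the variables from the sample .env (hypothetical helper)."""
    google_key = env.get("GOOGLE_API_KEY", "").strip()
    if not google_key:
        raise ValueError("GOOGLE_API_KEY is not set")

    port_raw = env.get("GEMINI_AGENT_PORT", "8042")
    try:
        port = int(port_raw)
    except ValueError:
        raise ValueError(f"GEMINI_AGENT_PORT must be an integer, got {port_raw!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"GEMINI_AGENT_PORT out of range: {port}")

    return Settings(
        google_api_key=google_key,
        agentverse_api_key=env.get("AGENTVERSE_API_KEY") or None,
        agentverse_url=env.get("AGENTVERSE_URL", "https://agentverse.ai").rstrip("/"),
        port=port,
    )
```

Typical usage would be `settings = load_settings(os.environ)` after the .env file has been loaded into the environment.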

Getting Your API Keys

1. Google AI API Key

Visit Google AI Studio
Sign in with your Google account
Click "Create API Key"
Copy the key and add it to your .env file as GOOGLE_API_KEY

2. Agentverse API Key

Why is this needed? Mailbox agents need an API key to authenticate with Agentverse ExternalStorage for uploading and sharing images.

To get your Agentverse API key, follow the detailed guide here: How to get Agentverse API Key

Once you have your API key, add it to your .env file as AGENTVERSE_API_KEY.

Note: Hosted agents on Agentverse can use their agent identity for storage instead of an API key. Mailbox agents running locally require an explicit API key.

Installation & Running

1. Install Dependencies

pip install -r requirements.txt

2. Configure Environment

Make sure your .env file is set up with both API keys:

# Verify your .env file
cat .env
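
Because cat prints the keys in full, a masked check can be preferable on shared screens or in recorded terminals. This is a minimal sketch; mask_secret and describe_keys are hypothetical helpers, and the sketch assumes the values are already present in the mapping you pass in (for example os.environ after python-dotenv's load_dotenv() has run).

```python
from typing import Dict, List, Mapping


def mask_secret(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def describe_keys(env: Mapping[str, str], names: List[str]) -> Dict[str, str]:
    """Report each key as masked text, or '<missing>' if unset or empty."""
    return {
        name: (mask_secret(env[name]) if env.get(name) else "<missing>")
        for name in names
    }


if __name__ == "__main__":
    import os

    report = describe_keys(os.environ, ["GOOGLE_API_KEY", "AGENTVERSE_API_KEY"])
    for name, shown in report.items():
        print(f"{name}={shown}")
```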

3. Run the Agent

python agent.py

You should see output like:

$ python agent.py
INFO:     [gemini_image_agent]: Starting agent with address: agent1qdzf55y0at40rezdtlhgzf4q67ym6xz0lh0c82j798q5eescxh2f6nh2vj5
INFO:     [gemini_image_agent]: Agent inspector available at https://agentverse.ai/inspect/?uri=http%3A//127.0.0.1%3A8042&address=agent1qdzf55y0at40rezdtlhgzf4q67ym6xz0lh0c82j798q5eescxh2f6nh2vj5
INFO:     [gemini_image_agent]: Starting server on http://0.0.0.0:8042 (Press CTRL+C to quit)
INFO:     [gemini_image_agent]: Starting mailbox client for https://agentverse.ai
INFO:     [gemini_image_agent]: Mailbox access token acquired
INFO:     [gemini_image_agent]: Manifest published successfully: AgentChatProtocol
INFO:     [uagents.registration]: Registration on Almanac API successful
INFO:     [uagents.registration]: Almanac contract registration is up to date!

Note: The agent connects to the mailbox service automatically when mailbox=True is set. This allows your local agent to receive messages from other agents and ASI:One. For more details on mailbox connectivity, see Mailbox Agents.
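
If the inspector line has scrolled out of view, the link can be rebuilt from the agent address and port. The sketch below infers the URL layout purely from the sample log line above; inspector_url is a hypothetical helper, and uAgents prints the authoritative link itself.

```python
from urllib.parse import quote


def inspector_url(
    address: str,
    port: int,
    host: str = "127.0.0.1",
    base: str = "https://agentverse.ai",
) -> str:
    """Rebuild the inspector link in the shape shown in the sample log output."""
    # quote() keeps "/" but percent-encodes ":", matching "http%3A//127.0.0.1%3A8042"
    local_uri = quote(f"http://{host}:{port}")
    return f"{base}/inspect/?uri={local_uri}&address={address}"
```

Calling it with the address and port from the sample output reproduces the logged link character for character.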

4. Register Your Agent on Agentverse

To make your agent discoverable by ASI:One:

Keep your agent running locally
Visit the Agent inspector link shown in the output above
Connect your agent on Agentverse

[Image: Local Agent Registration]

Adding a README to your Agent

A well-written README helps make your agent easily discoverable by ASI:One LLM and other agents. Once your agent is registered on Agentverse, follow these steps to add or edit your agent's README:

Navigate to your agent's profile page on Agentverse
Click the Edit button next to Readme.md
Add a comprehensive description:

In the Readme.md editor, add a detailed description of your agent. Include:

- Purpose: What your agent does (image generation from text)
- Functionalities: Key capabilities (image types, styles supported)
- Usage guidelines: How to interact with the agent (prompt best practices)
- Technical specifications: Model used (Gemini 2.5 Flash Image), protocols, storage
- Examples: Sample prompts and use cases

Example sections to include:

- Image types: photorealistic, artistic, logos, scenes, abstract art
- Best practices: Be specific, include style information, specify technical details
- Example prompts: "A sunset over mountain peaks", "A minimalist tech logo", etc.
- Use cases: Marketing, creative projects, social media content
- Response time: 5-15 seconds typical
- Limitations: One image per request, follows content policy

A good description helps ASI:One LLM understand when to select your agent for image generation tasks. Please refer to the Importance of Good Readme section for more details.

Make sure that your agent is enabled with Agent Chat Protocol v0.3.0

Verify that the chat protocol is properly configured and the agent is using the latest protocol version.

Testing Your Agent

1. Test via Agentverse Chat Interface

Once your agent is running and registered:

Go to your agent's page on Agentverse
Click "Chat with Agent" button
Send a prompt like: "Generate an image of a sunset over mountains"
Wait for the image to generate and display

[Image: Agentverse Chat Test]

2. Query from ASI:One LLM

Log in to ASI:One LLM:

Visit ASI:One and log in with your Google Account or ASI:One Wallet

Toggle the Agents switch:

Enable the "Agents" toggle to allow ASI:One to connect with Agentverse agents

Use your agent's handle:

Type @gemini-image-agent (or your agent's handle) followed by your prompt:

@gemini-image-agent Create a photorealistic image of a golden retriever puppy playing in a garden

View the generated image:

Your agent will process the request, generate the image with Gemini, and send it back through ASI:One

[Image: ASI Image Test]

Note: Make sure your agent is running locally and connected to the mailbox. If ASI:One can't reach your agent, verify that:

Your agent is running (python agent.py)

It's registered on Agentverse under "Local Agents"

Your firewall isn't blocking the agent port
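
For the first and last items, a quick local probe can tell whether anything is listening on the agent port at all. This is a minimal sketch using the standard library; port_is_listening is a hypothetical helper, and it only checks reachability from the machine it runs on, not from the outside.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8042 is the default GEMINI_AGENT_PORT used in this guide
    print("agent port open:", port_is_listening("127.0.0.1", 8042))
```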

Add Conversation History

Store previous requests to maintain context:

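@agent.on_event("startup")
async def startup(ctx: Context):
    # Initialize the history only if it is missing, so a restart does not wipe it
    if ctx.storage.get("image_history") is None:
        ctx.storage.set("image_history", {})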

# In handle_message, track history:
history = ctx.storage.get("image_history") or {}
history[sender] = history.get(sender, []) + [prompt]
ctx.storage.set("image_history", history)
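
Left unbounded, the per-sender list grows with every request. The sketch below shows one way to cap it. DictStorage is a hypothetical in-memory stand-in with the same get/set shape used on ctx.storage above, and record_prompt and max_items are illustrative names rather than part of uAgents.

```python
from typing import Any, Dict, List


class DictStorage:
    """Hypothetical in-memory stand-in exposing get/set like ctx.storage."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


def record_prompt(storage: Any, sender: str, prompt: str, max_items: int = 20) -> List[str]:
    """Append a prompt to the sender's history, keeping only the newest max_items."""
    history = storage.get("image_history") or {}
    prompts = history.get(sender, []) + [prompt]
    history[sender] = prompts[-max_items:]
    storage.set("image_history", history)
    return history[sender]
```

Inside handle_message, the call would be `record_prompt(ctx.storage, sender, prompt)`, assuming ctx.storage keeps the get/set behaviour shown in the snippet above.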

Additional Resources

Complete Code Repository

For the complete working example with all files, visit:

Gemini Image Agent Example

Happy image generating! 🎨

If you have questions or run into issues, reach out to the Fetch.ai community on Discord or Telegram.
