Version: 1.0.4

Image Generation Agent

This guide demonstrates how to create an Image Generation Agent that can generate images based on text descriptions using the chat protocol. The agent is compatible with the Agentverse Chat Interface and can process natural language requests to generate images.

Overview

In this example, you'll learn how to build a uAgent that can:

Accept text descriptions through the chat protocol
Generate images using DALL-E 3
Store and manage generated images using Agent storage
Send generated images back to the user

For a basic understanding of how to set up an ASI:One compatible agent, please refer to the ASI:One Compatible Agents guide first.

Message Flow

The communication between the User, Chat Interface, and Image Generator Agent proceeds as follows:

User Query
- 1: The user submits a text description of the desired image through the Chat Interface.
Query Processing
- 2: The Chat Interface forwards the user's description to the Image Generator Agent as a ChatMessage.
Image Generation
- 3.1 and 3.2: The agent processes the text description using DALL-E 3.
- 4.1 and 4.2: The generated image is uploaded to External Storage.
Response & Resource Sharing
- 5.1: The agent sends the generated image back to the Chat Interface as a ResourceContent message.
- 5.2: The agent also sends a ChatAcknowledgement to confirm receipt and processing of the message.
User Receives Image
- 6: The Chat Interface displays the generated image to the user.
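
The numbered steps above can be sketched as a plain-Python pipeline. Every function here is a hypothetical stand-in that returns canned data (nothing calls DALL-E 3, the network, or Agentverse); the sketch only illustrates the order in which the agent acknowledges, generates, stores, and replies.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the real services; each returns canned data.
def fake_generate(prompt: str) -> str:
    return "https://example.com/image.png"   # steps 3.1 and 3.2: image generation

def fake_upload(image_url: str) -> str:
    return "asset-123"                       # steps 4.1 and 4.2: storage upload

def handle_request(
    prompt: str,
    generate: Callable[[str], str] = fake_generate,
    upload: Callable[[str], str] = fake_upload,
) -> List[Tuple[str, str]]:
    """Return the outgoing messages, in order, as (kind, payload) pairs."""
    outbox = [("ack", "received")]           # step 5.2: acknowledge receipt
    image_url = generate(prompt)
    asset_id = upload(image_url)
    outbox.append(("resource", asset_id))    # step 5.1: share the stored image
    return outbox

messages = handle_request("a lake at sunset")
```

Passing the generator and uploader in as parameters is a design choice for this sketch only; it makes the ordering easy to exercise without any API keys.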

Implementation

In this example, we will create an agent and its associated files on our local machine. The agent communicates using the chat protocol and connects to Agentverse via Mailbox. Refer to the Mailbox Agents section for the detailed steps for connecting a local agent to Agentverse.

Create a new directory named "image-generation" and create the following files:

mkdir image-generation    # Create the directory
cd image-generation       # Navigate into it

touch agent.py            # Main agent file
touch models.py           # Image generation models and functions
touch chat_proto.py       # Chat protocol implementation for text-based communication

1. Image Generation Implementation

The models.py file implements the logic for generating images using DALL-E 3. It handles the API connection, image generation, and response processing.

models.py
import os
from uagents import Model
from openai import OpenAI, OpenAIError

# Make sure OPENAI_API_KEY is set in your environment variables
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

if OPENAI_API_KEY is None:
    raise ValueError("You need to provide an OpenAI API Key.")

client = OpenAI(api_key=OPENAI_API_KEY)

class ImageRequest(Model):
    image_description: str

class ImageResponse(Model):
    image_url: str

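def generate_image(prompt: str) -> str:
    """Generate an image with DALL-E 3 and return its URL."""
    try:
        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
        )
    except OpenAIError as e:
        # Raise instead of returning the error text, so callers never
        # mistake an error message for an image URL.
        raise RuntimeError(f"Image generation failed: {e}") from e
    return response.data[0].url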

2. Chat Protocol Integration

The chat_proto.py file orchestrates the communication and image generation process when the agent receives a user's request. Here's how it works step by step:

i) Receiving the Message

@chat_proto.on_message(ChatMessage)
async def handle_message(ctx: Context, sender: str, msg: ChatMessage):

The agent listens for incoming ChatMessage instances.

ii) Acknowledging Receipt

await ctx.send(
    sender,
    ChatAcknowledgement(timestamp=datetime.utcnow(), acknowledged_msg_id=msg.msg_id),
)

Once a message is received, the agent immediately sends a ChatAcknowledgement back to the sender.

iii) Parsing Content

for item in msg.content:
    if isinstance(item, StartSessionContent):
        ...
    elif isinstance(item, TextContent):

The message content is iterated over.

If it contains StartSessionContent, the agent logs it and waits for further input.
If it contains TextContent, it's treated as the image prompt and passed for processing.
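
The branching above can be tried in isolation with stand-in classes. The two dataclasses below are simplified placeholders, not the real StartSessionContent and TextContent from uagents_core.contrib.protocols.chat, and extract_prompt is a hypothetical helper name used only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Simplified stand-ins for the chat protocol content types (assumption:
# the real classes carry more fields and validation than shown here).
@dataclass
class StartSessionContent:
    type: str = "start-session"

@dataclass
class TextContent:
    text: str
    type: str = "text"

Content = Union[StartSessionContent, TextContent]

def extract_prompt(content: List[Content]) -> Optional[str]:
    """Return the first text item as the prompt, skipping session markers."""
    for item in content:
        if isinstance(item, StartSessionContent):
            continue  # session start carries no prompt; keep looking
        if isinstance(item, TextContent):
            return item.text
    return None  # no text content found

prompt = extract_prompt([StartSessionContent(), TextContent(text="a red fox")])
```

Reading the text from the matched item, rather than from a fixed position in the list, keeps the logic correct when a session marker precedes the prompt.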

iv) Image Generation via DALL·E 3

prompt = item.text
image_url = generate_image(prompt)

The prompt is extracted and passed to the generate_image() function from models.py to generate an image URL using DALL·E 3.
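
Before calling generate_image(), it may be worth normalizing the prompt. The helper below is a minimal sketch, not part of the original agent: clean_prompt is a hypothetical name, and the 4000-character cap is an assumption based on the prompt limit OpenAI documents for dall-e-3, so check the current API reference before relying on it.

```python
# Assumed dall-e-3 prompt limit; verify against the OpenAI API reference.
MAX_PROMPT_CHARS = 4000

def clean_prompt(text: str, max_chars: int = MAX_PROMPT_CHARS) -> str:
    """Collapse whitespace, reject empty prompts, and truncate long ones."""
    prompt = " ".join(text.split())
    if not prompt:
        raise ValueError("The image description is empty.")
    return prompt[:max_chars]
```

Rejecting an empty description early avoids spending an API call on a request that cannot produce a useful image.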

v) Downloading the Image

response = requests.get(image_url)
if response.status_code == 200:
    image_data = response.content
    content_type = response.headers.get("Content-Type", "")

The generated image is downloaded using a direct HTTP request.
If successful, the image binary and MIME type are extracted for storage.
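
Two small defensive checks can make the download step more robust. This is a sketch under stated assumptions: is_http_url and normalize_mime_type are hypothetical helpers, and the image/png fallback simply mirrors the MIME type that the resource metadata in chat_proto.py uses.

```python
from urllib.parse import urlparse

def is_http_url(value: str) -> bool:
    """True only for absolute http(s) URLs, so stray text is never fetched."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

def normalize_mime_type(header: str, default: str = "image/png") -> str:
    """Strip parameters such as '; charset=...' and fall back for non-images."""
    mime = header.split(";", 1)[0].strip().lower()
    return mime if mime.startswith("image/") else default
```

Calling is_http_url(image_url) before requests.get() and passing normalize_mime_type(content_type) to storage keeps an empty or parameterized Content-Type header from reaching the upload step.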

vi) Uploading to Agent Storage

asset_id = external_storage.create_asset(
    name=str(ctx.session),
    content=image_data,
    mime_type=content_type
)

The image is uploaded to the Agent's ExternalStorage system.
A unique asset_id is returned to identify the uploaded image.

vii) Permission Management

external_storage.set_permissions(asset_id=asset_id, agent_address=sender)

The agent sets viewing permissions so that only the user who requested the image can access it.

viii) Responding with the Image

asset_uri = f"agent-storage://{external_storage.storage_url}/{asset_id}"
await ctx.send(sender, create_resource_chat(asset_id, asset_uri))

The agent constructs a ResourceContent message containing the image asset.
This message is sent back to the user for viewing in the chat interface.
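
The URI format used above can be isolated in a small function for testing. build_asset_uri is a hypothetical helper; stripping a trailing slash from the storage URL is an added convenience that the original code does not perform.

```python
def build_asset_uri(storage_url: str, asset_id: str) -> str:
    """Compose the agent-storage URI from the storage URL and the asset ID."""
    # rstrip avoids a double slash if the storage URL ends with "/".
    return f"agent-storage://{storage_url.rstrip('/')}/{asset_id}"

uri = build_asset_uri("https://agentverse.ai/v1/storage", "1234")
```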

Whole script

This agent leverages external storage to securely upload, store, and share generated images. An Agentverse API key is required for authentication and to enable interaction with the external storage. You can obtain your API key from Agentverse; for detailed instructions, please refer to the Agentverse API Key guide.

chat_proto.py
import os
from datetime import datetime
from uuid import uuid4

import requests
from pydantic.v1 import UUID4

from uagents import Context, Protocol
from uagents_core.contrib.protocols.chat import (
    ChatAcknowledgement,
    ChatMessage,
    EndSessionContent,
    Resource,
    ResourceContent,
    StartSessionContent,
    TextContent,
    chat_protocol_spec,
)
from uagents_core.storage import ExternalStorage
from models import generate_image

AGENTVERSE_API_KEY = os.getenv("AGENTVERSE_API_KEY")
STORAGE_URL = os.getenv("AGENTVERSE_URL", "https://agentverse.ai") + "/v1/storage"
if AGENTVERSE_API_KEY is None:
    raise ValueError("You need to provide an AGENTVERSE_API_KEY.")

external_storage = ExternalStorage(api_token=AGENTVERSE_API_KEY, storage_url=STORAGE_URL)


def create_text_chat(text: str) -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.utcnow(),
        msg_id=uuid4(),
        content=[TextContent(type="text", text=text)],
    )

def create_end_session_chat() -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.utcnow(),
        msg_id=uuid4(),
        content=[EndSessionContent(type="end-session")],
    )

def create_resource_chat(asset_id: str, uri: str) -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.utcnow(),
        msg_id=uuid4(),
        content=[
            ResourceContent(
                type="resource",
                resource_id=UUID4(asset_id),
                resource=Resource(
                    uri=uri,
                    metadata={
                        "mime_type": "image/png",
                        "role": "generated-image"
                    }
                )
            )
        ]
    )


chat_proto = Protocol(spec=chat_protocol_spec)


@chat_proto.on_message(ChatMessage)
async def handle_message(ctx: Context, sender: str, msg: ChatMessage):
    await ctx.send(
        sender,
        ChatAcknowledgement(timestamp=datetime.utcnow(), acknowledged_msg_id=msg.msg_id),
    )

    for item in msg.content:
        if isinstance(item, StartSessionContent):
            ctx.logger.info(f"Got a start session message from {sender}")
            continue
        elif isinstance(item, TextContent):
            ctx.logger.info(f"Got a message from {sender}: {item.text}")

            # Use the matched text item; msg.content[0] may be a StartSessionContent.
            prompt = item.text
            try:
                image_url = generate_image(prompt)

                response = requests.get(image_url)
                if response.status_code == 200:
                    content_type = response.headers.get("Content-Type", "")
                    image_data = response.content 
                    
                    try:
                        asset_id = external_storage.create_asset(
                            name=str(ctx.session),
                            content=image_data,
                            mime_type=content_type
                        )
                        ctx.logger.info(f"Asset created with ID: {asset_id}")

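                    except RuntimeError as err:
                        ctx.logger.error(f"Asset creation failed: {err}")
                        # Without an asset there is nothing to share, so notify the user and stop.
                        await ctx.send(
                            sender,
                            create_text_chat(
                                "Sorry, I couldn't process your request. Please try again later."
                            ),
                        )
                        return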

                    external_storage.set_permissions(asset_id=asset_id, agent_address=sender)
                    ctx.logger.info(f"Asset permissions set to: {sender}")

                    asset_uri = f"agent-storage://{external_storage.storage_url}/{asset_id}"
                    await ctx.send(sender, create_resource_chat(asset_id, asset_uri))

                else:
                    ctx.logger.error("Failed to download image")
                    await ctx.send(
                        sender,
                        create_text_chat(
                            "Sorry, I couldn't process your request. Please try again later."
                        ),
                    )
                    return

            except Exception as err:
                ctx.logger.error(err)
                await ctx.send(
                    sender,
                    create_text_chat(
                        "Sorry, I couldn't process your request. Please try again later."
                    ),
                )
                return

            await ctx.send(sender, create_end_session_chat())

        else:
            ctx.logger.info(f"Got unexpected content from {sender}")


@chat_proto.on_message(ChatAcknowledgement)
async def handle_ack(ctx: Context, sender: str, msg: ChatAcknowledgement):
    ctx.logger.info(f"Got an acknowledgement from {sender} for {msg.acknowledged_msg_id}")

3. Image Generator Agent Setup

The agent.py file initializes your agent and includes necessary protocols for handling user requests.

Note: If you want to add advanced features such as rate limiting or agent health checks, refer to the Football Team Agent section in the ASI1 Compatible uAgent guide.

agent.py
import os

from uagents import Agent

# Importing chat_proto also loads models.py, which checks OPENAI_API_KEY
from chat_proto import chat_proto

AGENT_SEED = os.getenv("AGENT_SEED", "image-generator-agent-seed-phrase")
AGENT_NAME = os.getenv("AGENT_NAME", "Image Generator Agent")

PORT = 8000
agent = Agent(
    name=AGENT_NAME,
    seed=AGENT_SEED,
    port=PORT,
    mailbox=True,
)


# Include protocol
agent.include(chat_proto, publish_manifest=True)

if __name__ == "__main__":
    agent.run()

Setting up Environment Variables

Make sure to set the following environment variables:

OPENAI_API_KEY: Your OpenAI API key for DALL-E 3 access
AGENTVERSE_API_KEY: Your Agentverse API key for storage access
AGENT_SEED: (Optional) Custom seed for your agent
AGENT_NAME: (Optional) Custom name for your agent
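
A quick startup check can report every missing variable at once instead of failing on the first one. This is a minimal sketch: missing_env_vars is a hypothetical helper, and the required names are taken from the list above.

```python
import os
from typing import List, Mapping

# The two variables the agent cannot run without (per the list above).
REQUIRED_VARS = ("OPENAI_API_KEY", "AGENTVERSE_API_KEY")

def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```

Accepting the environment as a parameter lets the check be exercised with a plain dictionary rather than the real process environment.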

Adding a README to your Agent

Start your agent and connect to Agentverse using the Agent Inspector Link in the logs. Please refer to the Mailbox Agents section to understand the detailed steps for connecting a local agent to Agentverse.

python3 agent.py

Click the link to open a new window in your browser, click Connect, and then select Mailbox. This connects your agent to Agentverse.

Once you connect your Agent via Mailbox, click on Agent Profile and navigate to the Overview section of the Agent. Your Agent will appear under local agents on Agentverse.

Click on Edit and give your Agent a clear, descriptive name and description so that the ASI1 LLM can find it easily. Please refer to the Importance of Good Readme section for more details.
Make sure the Agent has the right AgentChatProtocol.

Query your Agent

Look for your agent under local agents on Agentverse.
Navigate to the Overview tab of the agent and click on Chat with Agent to interact with the agent from the Agentverse Chat Interface.

Type in your image description, for example: "A serene landscape with mountains and a lake at sunset"
The agent will generate an image based on your description and send it back through the chat interface.

Note: Currently, the image sharing feature for agents is supported via the Agentverse Chat Interface. Support for image sharing through ASI:One will be available soon.
