
Image Generation Agent

This guide demonstrates how to create an Image Generation Agent that can generate images based on text descriptions using the chat protocol. The agent is compatible with the Agentverse Chat Interface and can process natural language requests to generate images.

Overview

In this example, you'll learn how to build a uAgent that can:

  • Accept text descriptions through the chat protocol
  • Generate images using DALL-E 3
  • Store and manage generated images using Agent storage
  • Send generated images back to the user

For a basic understanding of how to set up an ASI:One compatible agent, please refer to the ASI:One Compatible Agents guide first.

Message Flow

The communication between the User, Chat Interface, and Image Generator Agent proceeds as follows:

  1. User Query

    • 1: The user submits a text description of the desired image through the Chat Interface.
  2. Query Processing

    • 2: The Chat Interface forwards the user's description to the Image Generator Agent as a ChatMessage.
  3. Message Acknowledgement

    • 3: The agent immediately sends a ChatAcknowledgement to confirm receipt of the message.
  4. Image Generation

    • 4.1 and 4.2: The agent processes the text description using DALL-E 3.
    • 5.1 and 5.2: The generated image is uploaded to External Storage.
  5. Response & Resource Sharing

    • 6: The agent sends the generated image back to the Chat Interface as a ResourceContent message.
  6. User Receives Image

    • 7: The Chat Interface displays the generated image to the user.
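
The ordering of messages 3 to 6 can be sketched in plain Python. This is a simplified model under stated assumptions, not the uAgents API: the dataclasses `Ack` and `ResourceReply` and the functions `fake_generate`, `fake_upload`, and `handle_chat_message` are hypothetical stand-ins for ChatAcknowledgement, ResourceContent, the DALL-E 3 call, and External Storage.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass
class Ack:
    """Hypothetical stand-in for ChatAcknowledgement."""
    acknowledged_msg_id: UUID


@dataclass
class ResourceReply:
    """Hypothetical stand-in for a ChatMessage carrying ResourceContent."""
    resource_id: str
    uri: str


def fake_generate(prompt: str) -> bytes:
    # Stand-in for message 4 (the DALL-E 3 call); returns placeholder bytes.
    return f"image for: {prompt}".encode()


def fake_upload(data: bytes) -> str:
    # Stand-in for message 5 (External Storage upload); returns an asset ID.
    return f"asset-{len(data)}"


def handle_chat_message(msg_id: UUID, prompt: str) -> list:
    """Return the outgoing messages in the order the agent would send them."""
    outbox = [Ack(acknowledged_msg_id=msg_id)]  # message 3: acknowledge first
    image = fake_generate(prompt)               # message 4: generate
    asset_id = fake_upload(image)               # message 5: upload
    # Message 6: reply with a reference to the stored image (illustrative URI).
    outbox.append(ResourceReply(asset_id, f"agent-storage://example/{asset_id}"))
    return outbox


incoming_id = uuid4()
sent = handle_chat_message(incoming_id, "a red bicycle")
print([type(m).__name__ for m in sent])
```

The point of the sketch is the ordering: the acknowledgement goes out before any slow work starts, so the Chat Interface knows the request was received even if generation takes a while.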


Implementation

In this example, we will create an agent and its associated files on our local machine. The agent communicates using the chat protocol and connects to Agentverse via Mailbox. Refer to the Mailbox Agents section for the detailed steps for connecting a local agent to Agentverse.

Create a new directory named "image-generation" and create the following files:

mkdir image-generation   # Create the directory
cd image-generation      # Navigate into the directory

touch agent.py   # Main agent file with integrated chat protocol and message handlers for ChatMessage and ChatAcknowledgement
touch models.py  # Image generation models and functions

1. Image Generation Implementation

The models.py file implements the logic for generating images using DALL-E 3. It handles the API connection, image generation, and response processing.

models.py
import os

from openai import OpenAI, OpenAIError
from uagents import Model

# Read the OpenAI API key from the environment; set OPENAI_API_KEY before starting the agent.
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

if OPENAI_API_KEY is None:
    raise ValueError("You need to provide an OpenAI API Key.")

client = OpenAI(api_key=OPENAI_API_KEY)


class ImageRequest(Model):
    image_description: str


class ImageResponse(Model):
    image_url: str


def generate_image(prompt: str) -> str:
    """Generate an image with DALL-E 3 and return its URL."""
    try:
        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
        )
    except OpenAIError as e:
        # Raise instead of returning the error text, so callers never
        # mistake an error message for an image URL.
        raise RuntimeError(f"Image generation failed: {e}") from e
    return response.data[0].url
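
To exercise this logic without calling the OpenAI API, one option is to pass the client in as a parameter and substitute a stub in tests. The sketch below is a variant written for illustration, not part of models.py: `generate_image_with` and `make_stub_client` are hypothetical names, the empty-prompt check is an added assumption, and the stub only mimics the `response.data[0].url` shape used above.

```python
from types import SimpleNamespace


def generate_image_with(client, prompt: str) -> str:
    """Hypothetical variant of generate_image that takes the client as an argument."""
    if not prompt.strip():
        # Assumption: reject empty descriptions before spending an API call.
        raise ValueError("The image description must not be empty.")
    response = client.images.generate(model="dall-e-3", prompt=prompt)
    return response.data[0].url


def make_stub_client(url: str):
    """Build a stub exposing client.images.generate(...) with a .data[0].url result."""
    def generate(model: str, prompt: str):
        return SimpleNamespace(data=[SimpleNamespace(url=url)])

    return SimpleNamespace(images=SimpleNamespace(generate=generate))


stub = make_stub_client("https://example.com/image.png")
print(generate_image_with(stub, "a lighthouse at dawn"))
```

With a real `OpenAI` client passed in place of the stub, the same function would issue the actual request.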

2. Image Generator Agent Setup

The agent.py file is the core of the application. It integrates the chat protocol and contains the message handlers for ChatMessage and ChatAcknowledgement. As the main control center, it:

  • Handles chat messages, processes image generation requests, and manages external storage operations
  • Delivers generated images through the chat protocol interface

Note: If you want to add advanced features such as rate limiting or agent health checks, you can refer to the Football Team Agent section in the ASI1 Compatible uAgent guide.
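
As a rough illustration of what a rate limit means in practice (agent.py configures 30 requests per 60 minutes), the following sketch implements a per-sender sliding-window counter. It is a conceptual example only and does not show how QuotaProtocol is implemented; `SlidingWindowLimiter` and its `allow` method are hypothetical names.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical per-sender limiter: at most max_requests per window."""

    def __init__(self, window_size_minutes: int, max_requests: int):
        self.window_seconds = window_size_minutes * 60
        self.max_requests = max_requests
        self._hits = defaultdict(deque)  # sender -> timestamps of accepted requests

    def allow(self, sender: str, now: float) -> bool:
        hits = self._hits[sender]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(window_size_minutes=60, max_requests=30)
# One request per second from the same sender: the 31st is rejected.
results = [limiter.allow("agent1xyz", now=float(i)) for i in range(31)]
print(results.count(True), results[-1])
```

Passing `now` explicitly keeps the limiter deterministic and easy to test; a real deployment would typically read the clock inside `allow`.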

agent.py
import os
import requests
from datetime import datetime, timezone
from uuid import uuid4

from uagents import Agent, Context, Protocol
from uagents.experimental.quota import QuotaProtocol, RateLimit
from uagents_core.models import ErrorMessage
from uagents_core.storage import ExternalStorage

# Import chat protocol components
from uagents_core.contrib.protocols.chat import (
    chat_protocol_spec,
    ChatMessage,
    ChatAcknowledgement,
    TextContent,
    EndSessionContent,
    StartSessionContent,
    ResourceContent,
    Resource,
)

from models import ImageRequest, ImageResponse, generate_image

AGENT_SEED = os.getenv("AGENT_SEED", "image-generator-agent-seed-phrase")
AGENT_NAME = os.getenv("AGENT_NAME", "Image Generator Agent")
AGENTVERSE_API_KEY = os.getenv("AGENTVERSE_API_KEY")
STORAGE_URL = os.getenv("AGENTVERSE_URL", "https://agentverse.ai") + "/v1/storage"

if AGENTVERSE_API_KEY is None:
    raise ValueError("You need to provide an AGENTVERSE_API_KEY.")

external_storage = ExternalStorage(api_token=AGENTVERSE_API_KEY, storage_url=STORAGE_URL)

PORT = 8000
agent = Agent(
    name=AGENT_NAME,
    seed=AGENT_SEED,
    port=PORT,
    mailbox=True,
)

# Create the chat protocol
chat_proto = Protocol(spec=chat_protocol_spec)

def create_text_chat(text: str) -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.now(timezone.utc),
        msg_id=uuid4(),
        content=[TextContent(type="text", text=text)],
    )

def create_end_session_chat() -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.now(timezone.utc),
        msg_id=uuid4(),
        content=[EndSessionContent(type="end-session")],
    )

def create_resource_chat(asset_id: str, uri: str) -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.now(timezone.utc),
        msg_id=uuid4(),
        content=[
            ResourceContent(
                type="resource",
                resource_id=asset_id,
                resource=Resource(
                    uri=uri,
                    metadata={
                        "mime_type": "image/png",
                        "role": "generated-image",
                    },
                ),
            )
        ],
    )

# Chat protocol message handler
@chat_proto.on_message(ChatMessage)
async def handle_message(ctx: Context, sender: str, msg: ChatMessage):
    # Send acknowledgement
    await ctx.send(
        sender,
        ChatAcknowledgement(
            timestamp=datetime.now(timezone.utc),
            acknowledged_msg_id=msg.msg_id,
        ),
    )

    # Process message content
    for item in msg.content:
        if isinstance(item, StartSessionContent):
            ctx.logger.info(f"Got a start session message from {sender}")
            continue
        elif isinstance(item, TextContent):
            ctx.logger.info(f"Got a message from {sender}: {item.text}")

            prompt = item.text
            try:
                image_url = generate_image(prompt)

                # Download the generated image; the timeout keeps a stalled
                # download from blocking the handler indefinitely.
                response = requests.get(image_url, timeout=60)
                if response.status_code == 200:
                    content_type = response.headers.get("Content-Type", "")
                    image_data = response.content

                    try:
                        asset_id = external_storage.create_asset(
                            name=str(ctx.session),
                            content=image_data,
                            mime_type=content_type,
                        )
                        ctx.logger.info(f"Asset created with ID: {asset_id}")

                    except RuntimeError as err:
                        ctx.logger.error(f"Asset creation failed: {err}")
                        # Without an asset there is nothing to share, so
                        # re-raise and let the outer handler notify the user.
                        raise

                    external_storage.set_permissions(asset_id=asset_id, agent_address=sender)
                    ctx.logger.info(f"Asset permissions set to: {sender}")

                    asset_uri = f"agent-storage://{external_storage.storage_url}/{asset_id}"
                    await ctx.send(sender, create_resource_chat(asset_id, asset_uri))

                else:
                    ctx.logger.error("Failed to download image")
                    await ctx.send(
                        sender,
                        create_text_chat(
                            "Sorry, I couldn't process your request. Please try again later."
                        ),
                    )
                    return

            except Exception as err:
                ctx.logger.error(err)
                await ctx.send(
                    sender,
                    create_text_chat(
                        "Sorry, I couldn't process your request. Please try again later."
                    ),
                )
                return

            await ctx.send(sender, create_end_session_chat())

        else:
            ctx.logger.info(f"Got unexpected content from {sender}")

# Chat protocol acknowledgement handler
@chat_proto.on_message(ChatAcknowledgement)
async def handle_ack(ctx: Context, sender: str, msg: ChatAcknowledgement):
    ctx.logger.info(f"Got an acknowledgement from {sender} for {msg.acknowledged_msg_id}")

# Optional: Rate limiting protocol for direct requests
proto = QuotaProtocol(
    storage_reference=agent.storage,
    name="Image-Generation-Protocol",
    version="0.1.0",
    default_rate_limit=RateLimit(window_size_minutes=60, max_requests=30),
)

# Optional: Direct request handler for structured requests
@proto.on_message(ImageRequest, replies={ImageResponse, ErrorMessage})
async def handle_request(ctx: Context, sender: str, msg: ImageRequest):
    ctx.logger.info("Received image generation request")
    try:
        image_url = generate_image(msg.image_description)
        ctx.logger.info("Successfully generated image")
        await ctx.send(sender, ImageResponse(image_url=image_url))
    except Exception as err:
        ctx.logger.error(err)
        await ctx.send(sender, ErrorMessage(error=str(err)))

# Register protocols
agent.include(chat_proto, publish_manifest=True)
agent.include(proto, publish_manifest=True)

if __name__ == "__main__":
    agent.run()
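
The chat handler builds the asset URI by joining the storage URL and the asset ID under an agent-storage:// prefix. The helper below isolates that step and adds a matching parser. Both function names are hypothetical, and the parser assumes the asset ID itself contains no slash.

```python
from typing import Tuple

PREFIX = "agent-storage://"


def build_asset_uri(storage_url: str, asset_id: str) -> str:
    """Mirror the f-string used in handle_message."""
    return f"{PREFIX}{storage_url}/{asset_id}"


def parse_asset_uri(uri: str) -> Tuple[str, str]:
    """Split a URI built by build_asset_uri back into (storage_url, asset_id)."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"Not an agent-storage URI: {uri}")
    # Split on the last slash: everything before it is the storage URL.
    storage_url, _, asset_id = uri[len(PREFIX):].rpartition("/")
    if not storage_url or not asset_id:
        raise ValueError(f"Malformed agent-storage URI: {uri}")
    return storage_url, asset_id


uri = build_asset_uri("https://agentverse.ai/v1/storage", "abc123")
print(uri)
print(parse_asset_uri(uri))
```

Note that the storage URL is embedded whole, scheme included, so the resulting URI contains two "://" sequences. Splitting on the last slash avoids having to interpret the embedded URL.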

Setting up Environment Variables

Make sure to set the following environment variables:

  • OPENAI_API_KEY: Your OpenAI API key for DALL-E 3 access
  • AGENTVERSE_API_KEY: Your Agentverse API key for storage access
  • AGENT_SEED: (Optional) Custom seed for your agent
  • AGENT_NAME: (Optional) Custom name for your agent
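
A small pre-flight check can report every missing required variable at once instead of failing on the first one. This is a minimal sketch, assuming only the two required variables listed above; `missing_env_vars` and `REQUIRED_VARS` are hypothetical names that do not appear in agent.py or models.py.

```python
import os
from typing import List, Mapping, Sequence

REQUIRED_VARS = ("OPENAI_API_KEY", "AGENTVERSE_API_KEY")


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping, so the check runs without touching the real environment.
example_env = {"OPENAI_API_KEY": "sk-example", "AGENT_NAME": "Image Generator Agent"}
print(missing_env_vars(env=example_env))
```

Calling `missing_env_vars()` with no arguments checks the real process environment. An agent could run that at startup and exit with a message naming the variables still to be set.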

Adding a README to your Agent

  1. Start your agent and connect it to Agentverse using the Agent Inspector link in the logs. Refer to the Mailbox Agents section for the detailed steps for connecting a local agent to Agentverse.
python3 agent.py

Click the link to open the Agent Inspector in a new browser window, click Connect, and then select Mailbox. This connects your agent to Agentverse.

  2. Once your Agent is connected via Mailbox, click on Agent Profile and navigate to the Overview section of the Agent. Your Agent will appear under local agents on Agentverse.

  3. Click on Edit and add a good description and name for your Agent so that it can be easily searchable by the ASI1 LLM. Please refer to the Importance of Good Readme section for more details.

  4. Make sure the Agent has the right AgentChatProtocol.

Query your Agent

  1. Look for your agent under local agents on Agentverse.

  2. Navigate to the Overview tab of the agent and click on Chat with Agent to interact with the agent from the Agentverse Chat Interface.

  3. Type in your image description, for example: "A serene landscape with mountains and a lake at sunset"

  4. The agent will generate an image based on your description and send it back through the chat interface.

Note: Currently, the image sharing feature for agents is supported via the Agentverse Chat Interface. Support for image sharing through ASI:One will be available soon.