Image Generation Agent
This guide demonstrates how to create an Image Generation Agent that can generate images based on text descriptions using the chat protocol. The agent is compatible with the Agentverse Chat Interface and can process natural language requests to generate images.
Overview
In this example, you'll learn how to build a uAgent that can:
- Accept text descriptions through the chat protocol
- Generate images using DALL-E 3
- Store and manage generated images using Agent storage
- Send generated images back to the user
For a basic understanding of how to set up an ASI:One compatible agent, please refer to the ASI:One Compatible Agents guide first.
Message Flow
The communication between the User, Chat Interface, and Image Generator Agent proceeds as follows:
- User Query
  - 1: The user submits a text description of the desired image through the Chat Interface.
- Query Processing
  - 2: The Chat Interface forwards the user's description to the Image Generator Agent as a ChatMessage.
- Message Acknowledgement
  - 3: The agent immediately sends a ChatAcknowledgement to confirm receipt of the message.
- Image Generation
  - 4.1 and 4.2: The agent processes the text description using DALL-E 3.
  - 5.1 and 5.2: The generated image is uploaded to External Storage.
- Response & Resource Sharing
  - 6: The agent sends the generated image back to the Chat Interface as a ResourceContent message.
- User Receives Image
  - 7: The Chat Interface displays the generated image to the user.
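The ordering of the agent's replies in this flow can be sketched without any agent framework. The block below is only an illustration: the `Ack`, `ResourceReply`, `TextReply` and `EndSession` classes, the `respond` function, and the `agent-storage://example-storage/...` URI are hypothetical stand-ins for the real chat protocol types, and the `generate` and `upload` callables stand in for DALL-E 3 and External Storage.

```python
from dataclasses import dataclass
from typing import Callable, List
from uuid import UUID, uuid4


@dataclass
class Ack:
    acknowledged_msg_id: UUID


@dataclass
class ResourceReply:
    asset_id: str
    uri: str


@dataclass
class TextReply:
    text: str


@dataclass
class EndSession:
    pass


def respond(
    msg_id: UUID,
    description: str,
    generate: Callable[[str], bytes],
    upload: Callable[[bytes], str],
) -> List[object]:
    """Return the replies the agent would send, in order, for one description."""
    # Step 3: acknowledge receipt before doing any slow work.
    replies: List[object] = [Ack(acknowledged_msg_id=msg_id)]
    try:
        image = generate(description)  # steps 4.1 and 4.2
        asset_id = upload(image)       # steps 5.1 and 5.2
    except Exception:
        # On any failure, reply with plain text instead of a resource.
        replies.append(TextReply("Sorry, I couldn't process your request."))
        return replies
    # Step 6: share the stored image, then close the session.
    replies.append(
        ResourceReply(asset_id, f"agent-storage://example-storage/{asset_id}")
    )
    replies.append(EndSession())
    return replies
```

Calling `respond(uuid4(), "a lake at sunset", lambda d: b"png-bytes", lambda b: "asset-1")` yields an acknowledgement, a resource reply and an end-session marker, in that order.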
Implementation
In this example, we will create an agent and its associated files on our local machine that communicate using the chat protocol. The agent will be connected to Agentverse via Mailbox. Refer to the Mailbox Agents section for the detailed steps for connecting a local agent to Agentverse.
Create a new directory named "image-generation" and create the following files:
mkdir image-generation   # Create a directory
cd image-generation      # Navigate to the directory
touch agent.py           # Main agent file with integrated chat protocol and message handlers for ChatMessage and ChatAcknowledgement
touch models.py          # Image generation models and functions
1. Image Generation Implementation
The models.py file implements the logic for generating images using DALL-E 3. It handles the API connection, image generation, and response processing.
import os

from uagents import Model
from openai import OpenAI, OpenAIError

# Make sure to set your OpenAI API key in the environment variables
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if OPENAI_API_KEY is None:
    raise ValueError("You need to provide an OpenAI API Key.")

client = OpenAI(api_key=OPENAI_API_KEY)


class ImageRequest(Model):
    image_description: str


class ImageResponse(Model):
    image_url: str


def generate_image(prompt: str) -> str:
    """Generate an image with DALL-E 3 and return its URL."""
    try:
        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
        )
    except OpenAIError as e:
        # Raise instead of returning the error text, so callers never
        # mistake an error message for an image URL.
        raise RuntimeError(f"An error occurred: {e}") from e
    return response.data[0].url
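Before downloading, a caller may want to confirm that the value returned by generate_image is a usable HTTP(S) URL. A minimal sketch using only the standard library follows; the helper name `looks_like_http_url` is hypothetical and not part of the guide's code.

```python
from typing import Optional
from urllib.parse import urlparse


def looks_like_http_url(value: Optional[str]) -> bool:
    """Return True only for strings with an http/https scheme and a host."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

A guard such as `if not looks_like_http_url(image_url): ...` before the download step lets the agent reply with a clear message instead of failing inside the HTTP client.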
2. Image Generator Agent Setup
The agent.py file is the core of your application. It integrates the chat protocol and contains message handlers for ChatMessage and ChatAcknowledgement. It serves as the main control center that:
- Handles chat messages with dedicated handlers for processing image generation requests and manages external storage operations
- Provides seamless image generation and delivery through the chat protocol interface
Note: If you want to add advanced features such as rate limiting or agent health checks, you can refer to the Football Team Agent section in the ASI1 Compatible uAgent guide.
import os
import requests
from datetime import datetime, timezone
from uuid import uuid4

from uagents import Agent, Context, Protocol
from uagents.experimental.quota import QuotaProtocol, RateLimit
from uagents_core.models import ErrorMessage
from uagents_core.storage import ExternalStorage

# Import chat protocol components
from uagents_core.contrib.protocols.chat import (
    chat_protocol_spec,
    ChatMessage,
    ChatAcknowledgement,
    TextContent,
    EndSessionContent,
    StartSessionContent,
    ResourceContent,
    Resource,
)

from models import ImageRequest, ImageResponse, generate_image

AGENT_SEED = os.getenv("AGENT_SEED", "image-generator-agent-seed-phrase")
AGENT_NAME = os.getenv("AGENT_NAME", "Image Generator Agent")
AGENTVERSE_API_KEY = os.getenv("AGENTVERSE_API_KEY")
STORAGE_URL = os.getenv("AGENTVERSE_URL", "https://agentverse.ai") + "/v1/storage"

if AGENTVERSE_API_KEY is None:
    raise ValueError("You need to provide an AGENTVERSE_API_KEY.")

external_storage = ExternalStorage(api_token=AGENTVERSE_API_KEY, storage_url=STORAGE_URL)

PORT = 8000

agent = Agent(
    name=AGENT_NAME,
    seed=AGENT_SEED,
    port=PORT,
    mailbox=True,
)

# Create the chat protocol
chat_proto = Protocol(spec=chat_protocol_spec)

def create_text_chat(text: str) -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.now(timezone.utc),
        msg_id=uuid4(),
        content=[TextContent(type="text", text=text)],
    )


def create_end_session_chat() -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.now(timezone.utc),
        msg_id=uuid4(),
        content=[EndSessionContent(type="end-session")],
    )


def create_resource_chat(asset_id: str, uri: str) -> ChatMessage:
    return ChatMessage(
        timestamp=datetime.now(timezone.utc),
        msg_id=uuid4(),
        content=[
            ResourceContent(
                type="resource",
                resource_id=asset_id,
                resource=Resource(
                    uri=uri,
                    metadata={
                        "mime_type": "image/png",
                        "role": "generated-image"
                    }
                )
            )
        ]
    )

# Chat protocol message handler
@chat_proto.on_message(ChatMessage)
async def handle_message(ctx: Context, sender: str, msg: ChatMessage):
    # Send acknowledgement
    await ctx.send(
        sender,
        ChatAcknowledgement(
            timestamp=datetime.now(timezone.utc),
            acknowledged_msg_id=msg.msg_id
        ),
    )

    # Process message content
    for item in msg.content:
        if isinstance(item, StartSessionContent):
            ctx.logger.info(f"Got a start session message from {sender}")
            continue
        elif isinstance(item, TextContent):
            ctx.logger.info(f"Got a message from {sender}: {item.text}")

            prompt = item.text
            try:
                image_url = generate_image(prompt)

                # Download the generated image; the timeout keeps the handler from hanging
                response = requests.get(image_url, timeout=30)
                if response.status_code == 200:
                    content_type = response.headers.get("Content-Type", "")
                    image_data = response.content

                    try:
                        asset_id = external_storage.create_asset(
                            name=str(ctx.session),
                            content=image_data,
                            mime_type=content_type
                        )
                        ctx.logger.info(f"Asset created with ID: {asset_id}")
                    except RuntimeError as err:
                        ctx.logger.error(f"Asset creation failed: {err}")
                        # asset_id is undefined here, so stop and let the
                        # outer handler send the apology message
                        raise

                    # Allow the sender to read the stored asset
                    external_storage.set_permissions(asset_id=asset_id, agent_address=sender)
                    ctx.logger.info(f"Asset permissions set to: {sender}")

                    asset_uri = f"agent-storage://{external_storage.storage_url}/{asset_id}"
                    await ctx.send(sender, create_resource_chat(asset_id, asset_uri))
                else:
                    ctx.logger.error("Failed to download image")
                    await ctx.send(
                        sender,
                        create_text_chat(
                            "Sorry, I couldn't process your request. Please try again later."
                        ),
                    )
                    return
            except Exception as err:
                ctx.logger.error(err)
                await ctx.send(
                    sender,
                    create_text_chat(
                        "Sorry, I couldn't process your request. Please try again later."
                    ),
                )
                return

            await ctx.send(sender, create_end_session_chat())
        else:
            ctx.logger.info(f"Got unexpected content from {sender}")

# Chat protocol acknowledgement handler
@chat_proto.on_message(ChatAcknowledgement)
async def handle_ack(ctx: Context, sender: str, msg: ChatAcknowledgement):
    ctx.logger.info(f"Got an acknowledgement from {sender} for {msg.acknowledged_msg_id}")

# Optional: Rate limiting protocol for direct requests
proto = QuotaProtocol(
    storage_reference=agent.storage,
    name="Image-Generation-Protocol",
    version="0.1.0",
    default_rate_limit=RateLimit(window_size_minutes=60, max_requests=30),
)


# Optional: Direct request handler for structured requests
@proto.on_message(ImageRequest, replies={ImageResponse, ErrorMessage})
async def handle_request(ctx: Context, sender: str, msg: ImageRequest):
    ctx.logger.info("Received image generation request")
    try:
        image_url = generate_image(msg.image_description)
        ctx.logger.info("Successfully generated image")
        await ctx.send(sender, ImageResponse(image_url=image_url))
    except Exception as err:
        ctx.logger.error(err)
        await ctx.send(sender, ErrorMessage(error=str(err)))


# Register protocols
agent.include(chat_proto, publish_manifest=True)
agent.include(proto, publish_manifest=True)

if __name__ == "__main__":
    agent.run()
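The chat handler above builds the resource URI as agent-storage://{storage_url}/{asset_id}. The sketch below shows that string format and one way to split it back apart. Both helper names are hypothetical, and the guide does not say how the Chat Interface parses the URI, so the splitting rule (everything after the last "/" is the asset ID) is an assumption.

```python
from typing import Tuple

PREFIX = "agent-storage://"


def build_asset_uri(storage_url: str, asset_id: str) -> str:
    """Compose the URI in the same format the chat handler uses."""
    return f"{PREFIX}{storage_url}/{asset_id}"


def split_asset_uri(uri: str) -> Tuple[str, str]:
    """Split an agent-storage URI into (storage_url, asset_id).

    Assumes the asset ID itself contains no "/" characters.
    """
    if not uri.startswith(PREFIX):
        raise ValueError(f"Not an agent-storage URI: {uri!r}")
    storage_url, _, asset_id = uri[len(PREFIX):].rpartition("/")
    if not storage_url or not asset_id:
        raise ValueError(f"Malformed agent-storage URI: {uri!r}")
    return storage_url, asset_id
```

Note that with the default STORAGE_URL the storage part is itself a full URL, so the resulting string contains two schemes, for example agent-storage://https://agentverse.ai/v1/storage/ followed by the asset ID.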
Setting up Environment Variables
Make sure to set the following environment variables:
- OPENAI_API_KEY: Your OpenAI API key for DALL-E 3 access
- AGENTVERSE_API_KEY: Your Agentverse API key for storage access
- AGENT_SEED: (Optional) Custom seed for your agent
- AGENT_NAME: (Optional) Custom name for your agent
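A quick check for the two required variables can be run before starting the agent. This is a minimal sketch: the function name `missing_env_vars` and the `REQUIRED` tuple are hypothetical, and only the variable names come from the list above.

```python
import os
from typing import List, Mapping, Sequence

# The two variables the guide's code refuses to start without.
REQUIRED = ("OPENAI_API_KEY", "AGENTVERSE_API_KEY")


def missing_env_vars(
    env: Mapping[str, str] = os.environ,
    required: Sequence[str] = REQUIRED,
) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing the environment in as a mapping keeps the check easy to exercise with a plain dictionary instead of the real process environment.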
Adding a README to your Agent
- Start your agent and connect to Agentverse using the Agent Inspector Link in the logs. Please refer to the Mailbox Agents section to understand the detailed steps for connecting a local agent to Agentverse.
python3 agent.py
Click the Agent Inspector link shown in the agent logs; it opens a new window in your browser. Click Connect and then select Mailbox to connect your agent to Agentverse.
- Once you connect your Agent via Mailbox, click on Agent Profile and navigate to the Overview section of the Agent. Your Agent will appear under local agents on Agentverse.
- Click on Edit and add a good description and name for your Agent so that it can be easily found by the ASI1 LLM. Please refer to the Importance of Good Readme section for more details.
- Make sure the Agent has the right AgentChatProtocol.
Query your Agent
- Look for your agent under local agents on Agentverse.
- Navigate to the Overview tab of the agent and click on Chat with Agent to interact with the agent from the Agentverse Chat Interface.
- Type in your image description, for example: "A serene landscape with mountains and a lake at sunset"
- The agent will generate an image based on your description and send it back through the chat interface.
Note: Currently, the image sharing feature for agents is supported via the Agentverse Chat Interface. Support for image sharing through ASI:One will be available soon.