AI Agent Programming with Agno

This example is carried out on macOS in the VS Code editor. The Agno framework is used.

Prerequisites for the tutorial are:

  • Local Python installation
  • VS Code or another code editor

Installing the Python Package Manager (uv)

Installing uv (macOS & Windows)

macOS (Homebrew)

Shell
brew install uv
uv --version

Windows (PowerShell)

Shell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
uv --version

Creating the Project Directory and uv Project

Creating the project directory

Shell
mkdir ai-agent-project

Navigate into the project

Shell
cd ai-agent-project

Initialise the project

Shell
uv init

The project directory now contains the following files:

  • main.py
  • pyproject.toml
  • README.md

You can quickly verify this by entering the following command in the current terminal:

Shell
ls

Once we have the desired project structure, we install the required packages (python-dotenv is included here because the examples below load the API key from a .env file):

Shell
uv add agno openai python-dotenv

If Python is present on the system, the existing installation is used to create a virtual environment.

The project directory should now show two changes:

  • .venv (New folder with all dependencies and packages for the project)
  • uv.lock (File that pins the dependencies/packages and versions)
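
Later in the tutorial we run scripts with python main.py. For that to pick up the packages in .venv, the virtual environment must be active in the terminal; alternatively, uv run executes a script inside the project environment without activating anything. A quick sketch (macOS shell):

Shell
# Option 1: activate the environment, then run scripts as usual
source .venv/bin/activate
python main.py

# Option 2: let uv handle the environment for a single command
uv run main.py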

We also create a .env file containing all our secret keys such as the API key for OpenAI or Gemini.

Shell
nano .env

There you create the variable OPENAI_API_KEY:

OPENAI_API_KEY=<your-key-goes-here-without-the-angle-brackets>
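
To make sure the key is actually picked up later, you can run a quick check (a minimal sketch; load_dotenv reads the .env file from the current directory):

Python
from dotenv import load_dotenv
import os

load_dotenv()
# Prints True if OPENAI_API_KEY was found in .env or the environment
print(os.getenv("OPENAI_API_KEY") is not None)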

We now have everything needed to start with a simple agent example. For this we open VS Code with the following command (or navigate to the project folder via the VS Code interface):

Opening VS Code

Shell
code .

Programming the Agent

Static AI Agent without "Memory"

For a very simple AI agent that always runs with the same hardcoded prompt, we can insert the following code into the main.py file:

Python
# Imports
from dotenv import load_dotenv
from agno.agent import Agent
from agno.models.openai import OpenAIResponses

# Load environment variables
load_dotenv()

def main():
	# Create agent with OpenAI model
	agent = Agent(
		model=OpenAIResponses(id="gpt-4o-mini"),
		markdown=True,
	)
	# Send request to agent and output response
	agent.print_response("Tell me a very funny software engineering joke", stream=True)

if __name__ == "__main__":
	main()

After running the code, this is the output on the console:

Message
> Tell me a very funny software engineering joke

Response (2.9s)
> Sure! Here's a classic:  
>  
> Why do programmers prefer dark mode?  
>  
> Because light attracts bugs! 🐛💻

Dynamic AI Agent with "Memory"

Our previous agent had no way to "remember" the prior conversation. It also always ran with the same static prompt, which we had to change in the code before each run.

To give the agent memory we need the SQLAlchemy package (used by Agno's SQLite storage), which we install via uv:

Shell
uv add sqlalchemy

To improve this we store the conversation in a temporary SQLite database and start the agent as a CLI app (CLI = command-line interface, i.e. interaction via the command line).

After the first run of the code, the project directory contains the folder tmp with the file agent.db in it. We add the following attributes to the agent object:

  • db (Specifies which database we use and where it is located)
  • add_history_to_context (Tells the agent that the conversation history should be included in every new prompt)
  • num_history_runs (Maximum number of recent conversation turns to include in the context)

Python
# Imports
from dotenv import load_dotenv
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIResponses

# Load environment variables
load_dotenv()

def main():
	# Create agent with OpenAI model, database and chat history
	agent = Agent(
		model=OpenAIResponses(id="gpt-4o-mini"),
		db=SqliteDb(db_file="tmp/agent.db"),
		add_history_to_context=True,
		num_history_runs=5,
		markdown=True,
		stream=True
	)

	# Start interactive CLI
	agent.cli_app(stream=True)

if __name__ == "__main__":
	main()

Once you start the small application you can have conversations like this:

Markdown
Message
> Hi, I'm Marios

Response (2.4s)
> Hello Marios! How can I help you today?

Message
> What is my name?

Response (2.1s)
> Your name is Marios. How can I help you further?

Message
> Tell me a funny joke and use my name in it

Response (1.7s)
> Sure, here's a joke with your name:   
> Why did Marios put a GPS tracker in the kitchen?  
> Because he kept losing track of the "Mario-location" of his snacks! 😂  
> Hope that put a smile on your face!
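
If you are curious about what the agent actually persists between runs, you can inspect the SQLite file directly (a quick look with the sqlite3 CLI; the table names it lists depend on the Agno version):

Shell
sqlite3 tmp/agent.db ".tables"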

Tools for the AI Agent

DuckDuckGo Tool

So that our agent has more capabilities than a simple chatbot, we can integrate so-called tools via Agno.

Full list of tools: https://docs.agno.com/tools/toolkits/overview

For our next example we choose the tool for the DuckDuckGo search engine. We install it via uv:

Shell
uv add duckduckgo-search ddgs

We pass the available tools to the agent via the tools attribute.

Python
# Imports
from dotenv import load_dotenv
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIResponses
from agno.tools.duckduckgo import DuckDuckGoTools

# Load environment variables
load_dotenv()

def main():
	# Create agent with OpenAI model, database, chat history and tools
    agent = Agent(
        model=OpenAIResponses(id="gpt-4o-mini"),
        db=SqliteDb(db_file="tmp/agent.db"),
        tools=[DuckDuckGoTools()],
        add_history_to_context=True,
        num_history_runs=5,
        markdown=True,
    )
    agent.cli_app(stream=True)

if __name__ == "__main__":
    main()

Below you can see what a possible conversation looks like after running the app. From the flow you can see that the AI model independently decided to use the DuckDuckGo tool to get the latest information.

Markdown
😎 User: What's the latest news on OpenAI? Summarise briefly in 3 sentences.

Message
> What's the latest news on OpenAI? Summarise briefly in 3 sentences.

Tool Calls
- search_news(query="OpenAI", max_results=5)

Response (8.7s)
Here are the latest developments on OpenAI:

1. OpenAI has introduced a new version of its Codex tool that runs on dedicated chips, described as the first milestone of its collaboration with a chip manufacturer. [Read article](https://www.msn.com/en-us/news/technology/a-new-version-of-openai-s-codex-is-powered-by-a-new-dedicated-chip/ar-AA1WepWX).

2. Greg Brockman, the president of OpenAI, has donated millions to Donald Trump to support OpenAI's mission, which has caused discontent among some employees. [Read article](https://www.wired.com/story/openai-president-greg-brockman-political-donations-trump-humanity/).

3. OpenAI has deployed Cerebras chips for the new model GPT-5.3 Codex Spark to reduce its dependence on Nvidia and speed up generation by a factor of 15. [Read article](https://venturebeat.com/technology/openai-deploys-cerebras-chips-for-15x-faster-code-generation-in-first-major).

Arxiv Tool

Another interesting tool that can be used especially in a scientific context is the Arxiv tool.

We install the dependencies/packages via uv as before:

Shell
uv add arxiv pypdf

For better context we define our agent's role via the instructions attribute and add ArxivTools to the tools list.

Full documentation for the tool can be found here: https://docs.agno.com/tools/toolkits/search/arxiv

Python
# Imports
from dotenv import load_dotenv
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIResponses
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.tools.arxiv import ArxivTools

# Load environment variables
load_dotenv()

def main():
	# Create agent with OpenAI model, database, chat history and tools
	agent = Agent(
		model=OpenAIResponses(id="gpt-4o-mini"),
		db=SqliteDb(db_file="tmp/agent.db"),
		tools=[DuckDuckGoTools(), ArxivTools()],
		instructions=[
			"You are a research assistant.",
			"Use ArxivTools to search and summarise academic papers or DuckDuckGo for web search.",
		],
		add_history_to_context=True,
		num_history_runs=5,
		markdown=True,
	)
	agent.cli_app(stream=True)

if __name__ == "__main__":
	main()

A conversation might look like this:

Markdown
😎 User: Search for papers on LLMs and AI agents and summarise briefly.

Message
> Search for papers on LLMs and AI agents and summarise briefly.

Tool Calls
- search_arxiv_and_return_articles(query="Large Language Models AI Agents", num_articles=5)
- search_arxiv_and_return_articles(query="LLMs AI Agents", num_articles=5)

Response (206.4s)
Here are some recent papers on *Large Language Models (LLMs) and AI agents*:

1. **Unmasking the Shadows of AI: Investigating Deceptive Capabilities in Large Language Models**  
   Author: Linge Guo  
   Summary: Investigates deceptive behaviour in LLMs and identifies four types of deception (strategic deception, imitation, flattery, "unfaithful thinking"), including social implications and possible countermeasures.  
   Links: http://arxiv.org/abs/2403.09676v1 · https://arxiv.org/pdf/2403.09676v1

2. **Is Self-knowledge and Action Consistent or Not: Investigating Large Language Model's Personality**  
   Authors: Yiming Ai et al.  
   Summary: Examines whether classic personality tests are meaningful for LLMs and whether "claimed" personality traits are consistent with observable behaviour.  
   Links: http://arxiv.org/abs/2402.14679v2 · https://arxiv.org/pdf/2402.14679v2

3. **Enhancing Human-Like Responses in Large Language Models**  
   Authors: Ethem Yağız Çalık, Talha Rüzgar Akkuş  
   Summary: Looks at methods to make LLM outputs more human-like (e.g. more natural language, emotional intelligence) and uses fine-tuning to improve human-AI interaction.  
   Links: http://arxiv.org/abs/2501.05032v2 · https://arxiv.org/pdf/2501.05032v2

4. **Knowledge-Driven Agentic Scientific Corpus Distillation Framework for Biomedical Large Language Models Training**  
   Authors: Meng Xiao et al.  
   Summary: Proposes an agentic multi-agent approach to distill high-quality training data from biomedical literature and reduce dependence on expensive/scarce annotations.  
   Links: http://arxiv.org/abs/2504.19565v3 · https://arxiv.org/pdf/2504.19565v3

5. **Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models**  
   Authors: Paul Darm, Annalisa Riccardi  
   Summary: Shows that targeted interventions at inference time (e.g. on certain attention heads) can affect or bypass safety mechanisms and trigger undesired coordination or behaviour changes.  
   Links: http://arxiv.org/abs/2502.05945v3 · https://arxiv.org/pdf/2502.05945v3

Custom Tools for the Agent

Often the built-in tools are not enough for specific business/use cases. In addition to the predefined tools provided by Agno, we can create our own tools and pass them to the agent. The agent then decides on its own whether to use them or not.

Next we create a custom tool that specifically accesses Hacker News to find and summarise the latest tech trends for us.

We don't need any new packages for this, but we do need to import json and httpx for the API calls, as well as tool from agno.tools. Custom tools are marked with the @tool() decorator.

More information can be found in the documentation: https://docs.agno.com/cookbook/tools/custom-tools

We implement this function in our code:

Python
# Imports
from dotenv import load_dotenv
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIResponses
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.tools.arxiv import ArxivTools
import httpx
import json
from agno.tools import tool

# Load environment variables
load_dotenv()

@tool()
def get_top_hackernews_stories(num_stories: int = 5) -> str:
	"""Fetch top stories from Hacker News.

	Args:
		num_stories (int): Number of stories to retrieve.
	"""
	# Fetch the IDs of the current top stories
	response = httpx.get("https://hacker-news.firebaseio.com/v0/topstories.json")
	story_ids = response.json()

	# Fetch the details of each story and drop the (potentially long) text field
	stories = []
	for story_id in story_ids[:num_stories]:
		story_response = httpx.get(
			f"https://hacker-news.firebaseio.com/v0/item/{story_id}.json"
		)
		story = story_response.json()
		story.pop("text", None)
		stories.append(story)

	return json.dumps(stories)

def main():
	# Create agent with OpenAI model, database, chat history and tools
	agent = Agent(
		model=OpenAIResponses(id="gpt-4o-mini"),
		db=SqliteDb(db_file="tmp/agent.db"),
		tools=[DuckDuckGoTools(), ArxivTools(), get_top_hackernews_stories],
		instructions=["You are a Tech News assistant. Use the HackerNews tool to find current stories."],
		add_history_to_context=True,
		num_history_runs=5,
		markdown=True,
	)
	agent.cli_app(stream=True)

if __name__ == "__main__":
	main()

We start the app and chat with the agent:

Markdown
😎 User: What tech news is there? Summarise very briefly

Message
> What tech news is there? Summarise very briefly

Tool Calls
- get_top_hackernews_stories(num_stories=5)

Response (11.9s)
Here are the current top tech stories from Hacker News:

1. **Zed Editor switches graphics libraries from blade to wgpu**  
   Brief: Major rendering update aimed at improving performance and compatibility.  
   Link: https://github.com/zed-industries/zed/pull/46758

2. **Monosketch**  
   Brief: New, simple sketching/drafting platform focused on straightforward drawing.  
   Link: https://monosketch.io/

3. **Open Source Is Not About You (2018)**  
   Brief: Essay/reminder on the philosophy and mindset behind open source.  
   Link: https://gist.github.com/richhickey/1563cddea1002958f96e7ba9519972d9

4. **Green’s Dictionary of Slang – 500 years of colloquial language**  
   Brief: Comprehensive online reference for slang terms and their history.  
   Link: https://greensdictofslang.com/

5. **Faster than Dijkstra?**  
   Brief: Discussion of alternative shortest-path approaches that might beat Dijkstra.  
   Link: https://systemsapproach.org/2026/02/09/faster-than-dijkstra/

Exposing the Agent via an API

To make the agent accessible over HTTP, we use Agno's AgentOS framework, which automatically generates the API endpoints for us. For a clean, structured response we additionally expose a custom FastAPI endpoint with a response schema.

More information on the approach can be found in the AgentOS documentation: https://docs.agno.com/agent-os/run-your-os

First we install the dependencies with uv:

Shell
uv add "fastapi[standard]" sqlalchemy PyJWT

Then we use the following code:

Python
# Imports
from dotenv import load_dotenv
from fastapi import FastAPI
from pydantic import BaseModel
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIResponses
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.tools.arxiv import ArxivTools
from agno.os import AgentOS
from agno.tools import tool
import httpx
import json

# Load environment variables
load_dotenv()

@tool()
def get_top_hackernews_stories(num_stories: int = 5) -> str:
    """Fetch top stories from Hacker News.

    Args:
        num_stories (int): Number of stories to retrieve.
    """
    response = httpx.get("https://hacker-news.firebaseio.com/v0/topstories.json")
    story_ids = response.json()

    stories = []
    for story_id in story_ids[:num_stories]:
        story_response = httpx.get(
            f"https://hacker-news.firebaseio.com/v0/item/{story_id}.json"
        )
        story = story_response.json()
        story.pop("text", None)
        stories.append(story)

    return json.dumps(stories)

# Create agent
agent = Agent(
    id="tech-agent",
    model=OpenAIResponses(id="gpt-4o-mini"),
    db=SqliteDb(db_file="tmp/agent.db"),
    tools=[DuckDuckGoTools(), ArxivTools(), get_top_hackernews_stories],
    instructions=["You are a Tech News assistant. Use the HackerNews tool to find current stories."],
    add_history_to_context=True,
    num_history_runs=5,
    markdown=True,
)

# Custom FastAPI App
custom_app = FastAPI(title="Tech Agent API")

# Request Schema
class ChatRequest(BaseModel):
    message: str
    session_id: str | None = None

# Clean chat endpoint
@custom_app.post("/chat")
def chat(request: ChatRequest):
    response = agent.run(request.message, session_id=request.session_id)
    return {
        "response": response.content,
        "session_id": response.session_id,
    }

# AgentOS with the custom app as base_app
agent_os = AgentOS(
    agents=[agent],
    base_app=custom_app,
)

app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="main:app", reload=True)

We then start the server with:

Shell
python main.py

The agent is available locally at the following URL:

Shell
POST http://localhost:7777/agents/tech-agent/runs
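
As an illustration, this auto-generated run endpoint can be called directly (a sketch; check the generated documentation at /docs for the exact request schema, here we assume a message form field):

Shell
curl -X POST http://localhost:7777/agents/tech-agent/runs \
  -F "message=Tell me a programming joke"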

The auto-generated API documentation is available at:

Shell
http://localhost:7777/docs

Sessions can be managed via this URL:

Shell
GET http://localhost:7777/sessions

Our custom /chat endpoint is also available:

Shell
POST http://localhost:7777/chat

With AgentOS we automatically have access to:

  • Runs
  • Sessions
  • Memory
  • etc.

We now test access to our agent. There are two options:

  1. HTTP client (e.g. Postman)
  2. curl

We do this using curl and send the following request:

Shell
curl -X POST http://localhost:7777/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What are the top Hacker News stories?"}'

From our API we get the following response:

Json
{"response":"Hier sind die aktuellen Top-Stories von HackerNews:\n\n1. **[GPT-5.2 derives a new result in theoretical physics](https://openai.com/index/new-result-theoretical-physics/)**  \n   von davidbarker  \n   - Punkte: 28  \n   - Kommentare: 2  \n   - Zeit: 2023-12-12\n\n2. **[Apple, fix my keyboard before the timer ends or I'm leaving iPhone](https://ios-countdown.win/)**  \n   von ozzyphantom  \n   - Punkte: 783  \n   - Kommentare: 402  \n   - Zeit: 2023-12-12\n\n3. **[Monosketch](https://monosketch.io/)**  \n   von penguin_booze  \n   - Punkte: 539  \n   - Kommentare: 108  \n   - Zeit: 2023-12-12\n\n4. **[Sandwich Bill of Materials](https://nesbitt.io/2026/02/08/sandwich-bill-of-materials.html)**  \n   von zdw  \n   - Punkte: 92  \n   - Kommentare: 6  \n   - Zeit: 2023-12-12\n\n5. **[Open Source Is Not About You (2018)](https://gist.github.com/richhickey/1563cddea1002958f96e7ba9519972d9)**  \n   von doubleg  \n   - Punkte: 154  \n   - Kommentare: 100  \n   - Zeit: 2023-12-12\n\nFalls du mehr Informationen zu einer spezifischen Story möchtest, lass es mich wissen!",
"session_id":"b6212889-e8d1-450f-bc9d-5f4b344fd405"}

Agent Frontend

So that we don't always have to enter data via the console and work with raw JSON, we add a small frontend. For this we use Streamlit, which is lightweight and quick to set up.

Enter the following commands in the terminal:

Shell
uv add streamlit requests python-dotenv

We create the frontend folder and the file app.py inside it:

Shell
mkdir frontend
cd frontend
nano app.py

There we add the following code, which serves as the chat interface:

Python
import streamlit as st
import requests
import json
import uuid

# Configuration
st.set_page_config(
	page_title="AI Agent Chat",
	page_icon="🤖",
	layout="wide"
)

st.title("🤖 AI Agent Chat")
st.markdown("---")

# Backend URL
API_URL = "http://localhost:7777"

# Initialize session state
if "messages" not in st.session_state:
	st.session_state.messages = []
	
if "session_id" not in st.session_state:
	st.session_state.session_id = str(uuid.uuid4())

# Display chat history
for message in st.session_state.messages:
	with st.chat_message(message["role"]):
		st.markdown(message["content"])

  

# Chat input
if user_input := st.chat_input("Type your message..."):
	# Add user message to history
	st.session_state.messages.append({
		"role": "user",
		"content": user_input
	})

	# Display user message
	with st.chat_message("user"):
		st.markdown(user_input)

	# Get response from backend
	with st.chat_message("assistant"):
		try:
			response = requests.post(
				f"{API_URL}/chat",
				json={
					"message": user_input,
					"session_id": st.session_state.session_id
				},
				timeout=60
			)
			response.raise_for_status()
		
			result = response.json()
	
			# Display only the response content
			response_text = result.get("response", "No response")
			st.markdown(response_text)
	
			# Add to history
			st.session_state.messages.append({
				"role": "assistant",
				"content": response_text
			})

		except requests.exceptions.ConnectionError:
			st.error(f"❌ Cannot connect to backend at {API_URL}")
		except requests.exceptions.Timeout:
			st.error("⏱️ Request timed out")
		except Exception as e:
			st.error(f"❌ Error: {str(e)}")

We now need to start both applications, for which we need two terminals.

For our API:

Shell
python main.py

For the frontend:

Shell
cd frontend
streamlit run app.py

Multi-Agent System

We now have an agent, including a frontend, that is very specialised in tech news. But we would like an agent that can decide for itself which other specialised agents to call: in other words, an orchestrator that decides who is best suited for a given request.

For this Agno offers the possibility of Teams, where a central agent acts as coordinator.

First we install one more dependency/package via uv:

Shell
uv add yfinance

We then replace the code in main.py with the following:

Python
# Imports
from dotenv import load_dotenv
from fastapi import FastAPI
from pydantic import BaseModel
from typing import List, Optional
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIResponses
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.tools.arxiv import ArxivTools
from agno.tools.yfinance import YFinanceTools
from agno.tools.hackernews import HackerNewsTools
from agno.team import Team
from agno.os import AgentOS

# Load environment variables
load_dotenv()

db = SqliteDb(db_file="tmp/team.db")

# === Create agents ===

news_agent = Agent(
    name="News Agent",
    role="You find current tech news on Hacker News.",
    model=OpenAIResponses(id="gpt-4o-mini"),
    tools=[HackerNewsTools()],
    instructions=["Find current news and summarise them."],
)

research_agent = Agent(
    name="Research Agent",
    role="You search for academic papers on arXiv.",
    model=OpenAIResponses(id="gpt-4o-mini"),
    tools=[ArxivTools()],
    instructions=["Search for relevant academic papers and summarise the results."],
)

stock_agent = Agent(
    name="Stock Agent",
    role="You analyse stock prices and financial data.",
    model=OpenAIResponses(id="gpt-4o-mini"),
    tools=[YFinanceTools()],
    instructions=["Provide current stock prices, analyst recommendations and financial data."],
)

web_agent = Agent(
    name="Web Search Agent",
    role="You search the web for current information.",
    model=OpenAIResponses(id="gpt-4o-mini"),
    tools=[DuckDuckGoTools()],
    instructions=["Search the web for relevant information."],
)

# === Create team ===

team = Team(
    id="research-team",
    name="Research Team",
    mode="coordinate",
    model=OpenAIResponses(id="gpt-4o-mini"),
    members=[news_agent, research_agent, stock_agent, web_agent],
    instructions=[
        "You are a coordinator who delegates requests to the right team members.",
        "Use the News Agent for current tech news.",
        "Use the Research Agent for academic research.",
        "Use the Stock Agent for stock and financial data.",
        "Use the Web Search Agent for general web searches.",
        "Summarise the results from all agents involved.",
    ],
    db=db,
    add_history_to_context=True,
    num_history_runs=5,
    store_member_responses=True,
    share_member_interactions=True,
    markdown=True,
)

# === Custom FastAPI App ===

custom_app = FastAPI(title="Research Team API")

class ChatRequest(BaseModel):
    message: str
    session_id: str | None = None

class AgentInfo(BaseModel):
    name: str
    role: str
    content: str

class ChatResponse(BaseModel):
    response: str
    session_id: Optional[str]
    agents_used: List[AgentInfo]

@custom_app.post("/chat", response_model=ChatResponse)
def chat(request: ChatRequest):
    response = team.run(request.message, session_id=request.session_id)

    # Extract involved agents from member_responses
    agents_used = []
    if response.member_responses:
        for member_response in response.member_responses:
            agent_name = getattr(member_response, "agent_id", None) or "Unknown"
            agents_used.append(AgentInfo(
                name=agent_name,
                role="Team Member",
                content=member_response.content or "",
            ))

    return ChatResponse(
        response=response.content,
        session_id=response.session_id,
        agents_used=agents_used,
    )

# === AgentOS with custom app ===

agent_os = AgentOS(
    agents=[news_agent, research_agent, stock_agent, web_agent],
    teams=[team],
    base_app=custom_app,
)

app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="main:app", reload=True)

We now adapt the frontend accordingly:

Python
import streamlit as st
import requests
import json
import uuid
# Page config
st.set_page_config(
    page_title="AI Agent Chat",
    page_icon="🤖",
    layout="wide"
)
st.title("🤖 AI Agent Chat")
st.markdown("---")
# Backend URL
API_URL = "http://localhost:7777"
# Initialize session state
if "messages" not in st.session_state:
    st.session_state.messages = []
if "session_id" not in st.session_state:
    st.session_state.session_id = str(uuid.uuid4())
# Display chat history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])
        # Show agents if available
        if "agents" in message and message["agents"]:
            with st.expander("👥 Agents involved"):
                for agent in message["agents"]:
                    st.markdown(f"**{agent['name']}** ({agent['role']})")
# Chat input
if user_input := st.chat_input("Type your message..."):
    # Add user message to history
    st.session_state.messages.append({
        "role": "user",
        "content": user_input,
        "agents": []
    })
    
    # Display user message
    with st.chat_message("user"):
        st.markdown(user_input)
    
    # Get response from backend
    with st.chat_message("assistant"):
        try:
            response = requests.post(
                f"{API_URL}/chat",
                json={
                    "message": user_input,
                    "session_id": st.session_state.session_id
                },
                timeout=60
            )
            response.raise_for_status()
            
            result = response.json()
            
            # Display the response content
            response_text = result.get("response", "No response")
            st.markdown(response_text)
            
            # Display agents used
            agents_used = result.get("agents_used", [])
            if agents_used:
                with st.expander("👥 Agents involved"):
                    for agent in agents_used:
                        st.markdown(f"**{agent['name']}** - {agent['role']}")
            
            # Add to history
            st.session_state.messages.append({
                "role": "assistant",
                "content": response_text,
                "agents": agents_used
            })
            
        except requests.exceptions.ConnectionError:
            st.error(f"❌ Cannot connect to backend at {API_URL}")
        except requests.exceptions.Timeout:
            st.error("⏱️ Request timed out")
        except Exception as e:
            st.error(f"❌ Error: {str(e)}")

We can now start both applications again and send appropriate prompts. After each response a dropdown menu shows which of the available agents was used.

Implementing STT and TTS

So that you can talk to the agents using natural speech as well as typing, here is one more version of the frontend: it transcribes voice input with OpenAI's Whisper (STT, speech-to-text) and reads the agent's answers aloud via OpenAI's TTS (text-to-speech) API.

Python
import streamlit as st
import requests
import json
import uuid
import os
from pathlib import Path
from dotenv import load_dotenv
from openai import OpenAI
# Load .env from parent directory
env_path = Path(__file__).parent.parent / ".env"
load_dotenv(env_path)
# Page config
st.set_page_config(
    page_title="AI Agent Chat",
    page_icon="🤖",
    layout="wide"
)
st.title("🤖 AI Agent Chat")
st.markdown("---")
# Backend URL
API_URL = "http://localhost:7777"
# OpenAI Client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# Initialize session state
if "messages" not in st.session_state:
    st.session_state.messages = []
if "session_id" not in st.session_state:
    st.session_state.session_id = str(uuid.uuid4())
if "input_mode" not in st.session_state:
    st.session_state.input_mode = "text"
if "last_audio_played" not in st.session_state:
    st.session_state.last_audio_played = None
# Display chat history (WITHOUT audio - audio will be shown separately)
for i, message in enumerate(st.session_state.messages):
    with st.chat_message(message["role"]):
        st.markdown(message["content"])
        # Show agents if available
        if "agents" in message and message["agents"]:
            with st.expander("👥 Agents involved"):
                for agent in message["agents"]:
                    st.markdown(f"**{agent['name']}** ({agent['role']})")
# Input mode toggle
st.markdown("### 📝 Input Mode")
col1, col2 = st.columns(2)
with col1:
    if st.button("⌨️ Text Input", use_container_width=True, 
                 type="primary" if st.session_state.input_mode == "text" else "secondary"):
        st.session_state.input_mode = "text"
        st.rerun()
with col2:
    if st.button("🎤 Voice Input", use_container_width=True,
                 type="primary" if st.session_state.input_mode == "voice" else "secondary"):
        st.session_state.input_mode = "voice"
        st.rerun()
st.markdown("---")
# Display selected input method
user_input = None
if st.session_state.input_mode == "text":
    text_input = st.chat_input("Type your message...")
    user_input = text_input
else:  # voice mode
    audio_input = st.audio_input("Record your message", key="audio_input")
    
    if audio_input:
        try:
            # Convert audio to text with Whisper
            with st.spinner("🎯 Transcribing..."):
                transcript = client.audio.transcriptions.create(
                    model="whisper-1",
                    file=("audio.wav", audio_input, "audio/wav")
                )
                user_input = transcript.text
                st.success(f"Transcribed: {user_input}")
        except Exception as e:
            st.error(f"❌ Transcription failed: {type(e).__name__}: {str(e)}")
            user_input = None
# Process input
if user_input:
    # Add user message to history
    st.session_state.messages.append({
        "role": "user",
        "content": user_input,
        "agents": []
    })
    
    # Display user message
    with st.chat_message("user"):
        st.markdown(user_input)
    
    # Get response from backend
    with st.chat_message("assistant"):
        try:
            response = requests.post(
                f"{API_URL}/chat",
                json={
                    "message": user_input,
                    "session_id": st.session_state.session_id
                },
                timeout=180
            )
            response.raise_for_status()
            result = response.json()
            
            # Display the response content
            response_text = result.get("response", "No response")
            st.markdown(response_text)
            
            # Display agents used
            agents_used = result.get("agents_used", [])
            if agents_used:
                with st.expander("👥 Agents involved"):
                    for agent in agents_used:
                        st.markdown(f"**{agent['name']}** - {agent['role']}")
            
            # Convert response to speech
            with st.spinner("🔊 Generating audio..."):
                audio_response = client.audio.speech.create(
                    model="gpt-4o-mini-tts-2025-12-15",
                    voice="echo",
                    input=response_text
                )
                audio_bytes = audio_response.content
                st.audio(audio_bytes, format="audio/mpeg", autoplay=True)
            
            # Add to history
            st.session_state.messages.append({
                "role": "assistant",
                "content": response_text,
                "agents": agents_used
            })
            
        except requests.exceptions.ConnectionError:
            st.error(f"❌ Cannot connect to backend at {API_URL}")
        except requests.exceptions.Timeout:
            st.error("⏱️ Request timed out")
        except Exception as e:
            st.error(f"❌ Error: {type(e).__name__}: {str(e)}")
# Show continue input if chat exists
if st.session_state.messages:
    st.markdown("---")
    st.write("💬 Continue the conversation:")
    
    if st.session_state.input_mode == "text":
        continue_input = st.chat_input("Type your next message...")
    else:  # voice mode
        continue_input = None
        continue_audio = st.audio_input("Record your next message", key="continue_audio")
        if continue_audio:
            try:
                with st.spinner("🎯 Transcribing..."):
                    transcript = client.audio.transcriptions.create(
                        model="whisper-1",
                        file=("audio.wav", continue_audio, "audio/wav")
                    )
                    continue_input = transcript.text
            except Exception as e:
                st.error(f"❌ Transcription failed: {str(e)}")
    
    # Process continue input
    if continue_input:
        # Add user message
        st.session_state.messages.append({
            "role": "user",
            "content": continue_input,
            "agents": []
        })
        
        # Display user message
        with st.chat_message("user"):
            st.markdown(continue_input)
        
        # Get response
        with st.chat_message("assistant"):
            try:
                response = requests.post(
                    f"{API_URL}/chat",
                    json={
                        "message": continue_input,
                        "session_id": st.session_state.session_id
                    },
                    timeout=180
                )
                response.raise_for_status()
                result = response.json()
                
                # Display response
                response_text = result.get("response", "No response")
                st.markdown(response_text)
                
                # Display agents
                agents_used = result.get("agents_used", [])
                if agents_used:
                    with st.expander("👥 Agents involved"):
                        for agent in agents_used:
                            st.markdown(f"**{agent['name']}** - {agent['role']}")
                
                # Generate and play audio
                with st.spinner("🔊 Generating audio..."):
                    audio_response = client.audio.speech.create(
                        model="gpt-4o-mini-tts-2025-12-15",
                        voice="echo",
                        input=response_text
                    )
                    audio_bytes = audio_response.content
                    st.audio(audio_bytes, format="audio/mpeg", autoplay=True)
                
                # Add to history
                st.session_state.messages.append({
                    "role": "assistant",
                    "content": response_text,
                    "agents": agents_used
                })
                
            except Exception as e:
                st.error(f"❌ Error: {type(e).__name__}: {str(e)}")

I hope everyone who read the post or even coded along enjoyed it and learned something new. AI agents are still a relatively new field that will accompany us more and more in the future, so it's important to stay on the ball and keep acquiring new knowledge.

For those who don't want to or can't code along step by step, here is the full repository: https://github.com/DFT-IT/agno-ai-agent-project.git

Until next time! :) -- Marios Tzialidis