Your agent broke. Find out why.

See what your AI agent actually did — step by step, decision by decision.

Debug and control autonomous AI systems. Trace every decision, replay any run, visualize the reasoning path.

replay.opswald.com/demo
Opswald Agent Replay
Session: 2024-03-15_14:32:17
User Query
"What are the top 3 products by revenue this quarter?"

AI agents aren't API calls. Stop debugging them like they are.

Traditional LLM tools show you tokens, latency, and prompt logs. But agents are different — they plan, decide, call tools, and adapt. When something goes wrong, the failure isn't in one call. It's in the chain of decisions.

🔄

They loop

Retrying failed strategies without telling you. Your agent might be stuck in a cycle, burning tokens on the same broken approach.

🤫

They fail silently

Returning confident wrong answers. HTTP 200, valid JSON, completely wrong result. Your monitoring sees green.

🔀

They call tools in unexpected sequences

Producing results you can't reconstruct. When tools chain in ways you didn't anticipate, the output becomes untraceable.

📉

They drift

Behaving differently than last week on the same input. Model updates, context changes, and tool responses shift — silently altering behavior.

🧠

They hallucinate context

Referencing data that was never provided. Your agent fills gaps with plausible fiction, making decisions based on information it invented.

💸

They waste tokens

Sending entire conversation histories on every call. Costs balloon while context windows overflow — and you can't see where the bloat is.

Your observability tool wasn't built for this.

Trace. Replay. Understand.

📡 Opswald Trace

See every step your agent takes

Capture LLM calls, tool invocations, decisions, and observations. Instrument automatically via the proxy (zero code changes) or with a one-line SDK init.

  • Every LLM call with full prompt and response
  • Tool invocations with parameters and results
  • Decision points with reasoning context
  • Automatic instrumentation via proxy or SDK
Trace Timeline
00:00.00 🤖 LLM Decision: Analyze user query
00:01.23 🔧 Tool: query_database() 247ms
00:01.47 👁 Observation: 3 rows returned
00:02.89 🔧 Tool: read_file() → FileNotFoundError ❌

▶️ Opswald Replay

Step through failures like a debugger

Interactive replay of any agent run. Jump to the exact step where things went wrong. See what the agent knew, what it decided, and why.

  • Step forward and backward through any run
  • Jump directly to error steps
  • See agent state at each decision point
  • Compare runs side by side
Replay Player
Step 5 of 7
◀ Prev Next ▶
❌ Tool Call Failed
read_file("quarterly_report.pdf")
FileNotFoundError: quarterly_report.pdf not found
Agent reasoning: "I should verify database results against the quarterly report for accuracy."

🔀 Opswald Graph

See why your agent chose this path

Explore the decision flow as a navigable graph: see which observations led to which decisions, and where the agent could have gone a different way.

  • Visualize decision flow as a DAG
  • Highlight causal paths to failures
  • See alternative paths not taken
  • Critical path analysis
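Causal path analysis reduces to graph search once decisions are recorded as a DAG. The sketch below shows the idea on the failing run from the mockup; the node names and adjacency structure are illustrative assumptions:

```python
# Decision DAG for the failing run: node -> nodes the agent moved to next.
graph = {
    "analyze_query": ["query_database"],
    "query_database": ["read_file"],
    "read_file": ["skip_verify"],          # read_file failed, so verification was skipped
    "skip_verify": ["return_unverified"],
    "return_unverified": [],
}

def causal_path(graph: dict[str, list[str]], root: str, target: str) -> list[str]:
    """Depth-first search for the chain of decisions that led to `target`."""
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        if node == target:
            return path
        for child in graph.get(node, []):
            stack.append((child, path + [child]))
    return []

path = causal_path(graph, "analyze_query", "return_unverified")
```

Highlighting this path against the ghost path (read_file → verify) makes the divergence point obvious: the agent skipped verification the moment the file read failed.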
Decision Graph
🤖 Analyze Query
🔧 Query Database
❌ Read File
⚠️ Skip Verify
✓ Read File
✓ Verify
⚠️ Return Unverified Answer
Ghost path shows the route the agent should have taken

Two minutes to first trace.

Add one line to your agent. Every LLM call, tool invocation, and decision is automatically captured.

Or use the proxy — zero code changes. Works with OpenAI, Anthropic, Mistral, local models.

Python SDK · TypeScript SDK · Proxy · LangChain · CrewAI

import opswald

opswald.init(api_key="your-key")

# Every agent call is now traced automatically.

# Or use the proxy — zero code changes:
# OPENAI_BASE_URL=https://proxy.opswald.com/v1
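For the proxy route, pointing traffic at Opswald can be as simple as setting the base URL before the client is created. The official openai Python SDK (v1+) reads `OPENAI_BASE_URL` from the environment, so no application code needs to change; the snippet below only demonstrates the environment setup:

```python
import os

# Route all OpenAI-compatible traffic through the Opswald proxy.
# The openai Python SDK picks up OPENAI_BASE_URL automatically
# when no explicit base_url is passed to the client.
os.environ["OPENAI_BASE_URL"] = "https://proxy.opswald.com/v1"

# from openai import OpenAI
# client = OpenAI()  # requests now flow through the proxy
```

The same environment variable works in CI, containers, or a shell profile, which is why the proxy path counts as zero lines of application code.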

Built for agent debugging, not just logging

Feature Opswald LangSmith Langfuse Helicone
Agent run replay ~
Decision graph
Causal path analysis
Step-by-step debugging ~ ~
Multi-step trace ~
Zero-code proxy setup

Built for the people who need answers

Developer

"My agent gave the wrong answer. What happened?"

Open the run, replay it step by step, see exactly where the reasoning went wrong.

Engineering Lead

"Our agent started failing after Friday's deploy. Why?"

Compare runs before and after. The decision graph shows which path changed.

Platform Team

"We have 20 agents in production. Which ones are breaking?"

See all runs, filter by failures, spot patterns across agents.

Founder / CTO

"Can we actually trust our agents to run autonomously?"

Not yet. But with full traces and replay, you'll know exactly when you can.

Debugging is just the beginning

Trace → Replay → Graph → Guard → Trust

Today: understand what happened.

Tomorrow: control what happens next.

16 event types auto-captured
<1ms overhead per event
2 min to first trace
0 lines of code with proxy

Built for the ecosystem

OpenAI
Anthropic
Mistral
LangChain
CrewAI
LlamaIndex
🔌 Proxy (zero code)
🐍 Python SDK
📘 TypeScript SDK

Start debugging in two minutes.

Connect via proxy. See your first trace. No credit card.

Read the Docs →