How It Works

This page explains what happens under the hood when your agent receives a task and responds — so you understand why it behaves the way it does and how to get the best results.

The agent lifecycle

Every interaction with an agent follows this lifecycle:

Receives your message — from the console, your app, a flow trigger, or a deployed channel
Plans — decides what information it needs and what steps to take
Searches its knowledge base — if you’ve connected one, it retrieves relevant context
Uses tools — takes real actions: reads from GitHub, sends a Slack message, queries a database
Reasons and responds — synthesizes everything into a clear, grounded answer or output
Streams back to you — you see the response as it’s generated, not after a long wait

Execution modes

Auteryn automatically selects the best execution mode for each task:

Mode	Best for	Speed
Standard	Quick questions, lookups, chat, short tasks	Fast (seconds)
Advanced	Complex reasoning, multi-step tasks, research, code generation	Thorough

Customer-facing and Collaboration agents always use Standard mode to keep responses fast
Internal agents use Advanced mode for complex tasks by default
You can override these defaults in the agent settings
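The defaults above can be read as a small decision rule. The following is only a sketch of that rule, not Auteryn's actual selection logic: the function name select_mode, the is_complex flag, the lowercase mode strings, and the choice to let an explicit override take precedence for every agent type are all assumptions made for illustration.

```python
from typing import Optional

# Agent types that the page says are kept on Standard mode for speed.
FAST_AGENT_TYPES = {"customer-facing", "collaboration"}
VALID_MODES = ("standard", "advanced")


def select_mode(agent_type: str, is_complex: bool = False,
                override: Optional[str] = None) -> str:
    """Return "standard" or "advanced" for a task (hypothetical helper)."""
    # Assumption: a mode set explicitly in the agent settings wins.
    if override is not None:
        if override not in VALID_MODES:
            raise ValueError(f"unknown mode: {override!r}")
        return override

    kind = agent_type.lower()
    if kind in FAST_AGENT_TYPES:
        return "standard"
    # Internal agents default to Advanced only when the task is complex.
    if kind == "internal" and is_complex:
        return "advanced"
    # Assumption: everything else falls back to Standard.
    return "standard"
```

For example, select_mode("Internal", is_complex=True) returns "advanced", while the same call for a Customer-facing agent returns "standard".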

How the agent uses tools

When the agent needs to take an action, it calls a tool. Here’s what that looks like in practice:

Example task: “Review the open PRs in our repo and flag anything that’s been waiting more than 3 days.”

Agent calls the GitHub tool → fetches open pull requests
Agent keeps only the PRs that have been open for more than 3 days
Agent calls the Slack tool → posts a summary to the designated channel
Agent returns a response confirming what it did

Each tool call is logged and visible in the Run History tab on your agent page — so you always know exactly what the agent did.
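To make the sequence concrete, here is a minimal sketch of the same task in Python. The two tool functions are hypothetical stand-ins that return or record sample data (they do not call GitHub or Slack), the channel name is made up, and "waiting more than 3 days" is approximated by each PR's creation time. The log list only imitates the idea of a run history and is not the format Auteryn uses.

```python
from datetime import datetime, timedelta, timezone


def fetch_open_prs():
    """Hypothetical stand-in for the GitHub tool; returns sample data."""
    return [
        {"number": 101, "title": "Fix login redirect",
         "created_at": "2024-05-01T09:00:00+00:00"},
        {"number": 102, "title": "Update README",
         "created_at": "2024-05-06T15:30:00+00:00"},
    ]


def post_to_slack(channel, text, log):
    """Hypothetical stand-in for the Slack tool; records the call only."""
    log.append({"tool": "slack", "channel": channel, "text": text})


def stale_prs(prs, now, max_age_days=3):
    """Keep PRs created more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [pr for pr in prs
            if datetime.fromisoformat(pr["created_at"]) < cutoff]


def run_task(now, channel="#eng-reviews"):
    log = []  # one entry per tool call, in order
    prs = fetch_open_prs()
    log.append({"tool": "github", "action": "list_open_prs",
                "count": len(prs)})
    stale = stale_prs(prs, now)
    summary = "\n".join(f"#{pr['number']} {pr['title']}"
                        for pr in stale) or "No stale PRs."
    post_to_slack(channel, summary, log)
    return f"Flagged {len(stale)} PR(s) in {channel}.", log


response, log = run_task(datetime(2024, 5, 7, 12, 0, tzinfo=timezone.utc))
print(response)
for entry in log:
    print(entry)
```

With the sample data and the fixed date above, only PR 101 is older than three days, so one PR is flagged and the log holds one GitHub entry followed by one Slack entry.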

How knowledge retrieval works

When your agent has a knowledge base, it doesn’t load all the documents every time. Instead:

Your message is converted into a semantic search query
The most relevant chunks are retrieved from your knowledge base
Those chunks are injected into the agent’s context before it responds
The agent answers based on your content — citing the source document

This means the agent only “sees” what’s relevant to the current task, so a very large knowledge base does not slow each response the way loading every document would.
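The four retrieval steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Auteryn's retrieval code: it uses word-count vectors and cosine similarity in place of a real embedding model, and the chunk texts, file names, and function names are invented for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (real systems use learned vectors)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])),
                    reverse=True)
    return ranked[:top_k]


def build_context(query, chunks, top_k=2):
    """Format retrieved chunks, tagged with their source, for the prompt."""
    hits = retrieve(query, chunks, top_k)
    return "\n".join(f"[{c['source']}] {c['text']}" for c in hits)


# Hypothetical knowledge base, already split into chunks.
CHUNKS = [
    {"source": "refunds.md",
     "text": "refunds are issued within 5 business days"},
    {"source": "shipping.md",
     "text": "standard shipping takes 3 to 7 days"},
    {"source": "security.md",
     "text": "passwords must be rotated every 90 days"},
]

print(build_context("how long do refunds take", CHUNKS, top_k=1))
```

Only the best-matching chunk reaches the context, and it keeps its source label so the answer can cite the document it came from.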

How computer use works

When you enable Computer Use, your agent gets access to an isolated browser environment:

Agent spawns a secure browser sandbox
Navigates to the target website
Takes a screenshot and identifies UI elements
Clicks, types, scrolls, or extracts data
Reports back with results and optionally a screenshot

The browser runs in a completely isolated container. It cannot access your local machine. See Computer Use for setup and use cases.

Real-time streaming

All agent responses are streamed in real time: you see the agent’s thinking and actions as they happen instead of receiving a single response after a long wait. This is especially useful for long-running tasks where you want to monitor progress.

In the Test panel, responses stream word by word. Tool calls appear inline as they happen.

The /api/run endpoint returns a text/event-stream response. Each SSE event has a type field whose value identifies the event, for example text_chunk, tool_call, or run_completed.
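If you consume the stream yourself, the handling could look roughly like the sketch below. It assumes each event arrives as a JSON object on a data: line with type as a top-level key. That payload shape, along with the text and name keys, is an assumption for illustration, so check the actual event format before relying on it. The sample lines are hand-written, not captured from /api/run.

```python
import json


def parse_sse(lines):
    """Yield one decoded JSON object per SSE event from text lines."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space
        if field == "data":
            data_lines.append(value)


def collect(events):
    """Gather streamed text and tool names until the run completes."""
    text, tool_calls = [], []
    for event in events:
        if event["type"] == "text_chunk":
            text.append(event.get("text", ""))    # hypothetical key
        elif event["type"] == "tool_call":
            tool_calls.append(event.get("name"))  # hypothetical key
        elif event["type"] == "run_completed":
            break
    return "".join(text), tool_calls


# Hand-written sample stream (assumed payload shape).
SAMPLE = [
    'data: {"type": "tool_call", "name": "github"}\n',
    "\n",
    'data: {"type": "text_chunk", "text": "Found 2 "}\n',
    "\n",
    'data: {"type": "text_chunk", "text": "stale PRs."}\n',
    "\n",
    'data: {"type": "run_completed"}\n',
    "\n",
]

print(collect(parse_sse(SAMPLE)))
```

In a real client, the lines would come from the HTTP response body instead of a list, and the same parse_sse generator can consume them as they arrive.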