Agent Runtime¶
Overview¶
The agent runtime is the execution layer that the gateway uses to drive LLM
sessions. TriOnyx uses the Claude Agent SDK (claude-agent-sdk) as
its runtime — a Python library that provides a streaming, tool-aware interface
to Claude models. The gateway invokes this runtime for each agent session,
enforcing the agent definition as hard constraints on the SDK configuration.
The runtime is not agentic itself. It is a controlled loop: send a prompt, stream responses, intercept tool requests, execute approved tools, feed results back, repeat until done.
Runtime Stack¶
- Language: Python (>=3.11)
- Package manager: UV — handles dependency resolution and virtual environments
- LLM interface:
claude-agent-sdk— async streaming agent loop - Authentication:
CLAUDE_CODE_OAUTH_TOKENenvironment variable, read automatically by the SDK
Agent Execution Lifecycle¶
When the gateway spawns an agent, it translates the agent definition into SDK configuration and runs the agent loop:
1. Configuration¶
The gateway reads the agent definition file and constructs ClaudeAgentOptions:
from claude_agent_sdk import ClaudeAgentOptions
options = ClaudeAgentOptions(
cwd=working_directory,
system_prompt={
"type": "preset",
"preset": "claude_code",
"append": agent_definition.system_prompt,
},
allowed_tools=agent_definition.tools, # Hard boundary from definition
permission_mode="acceptEdits", # Non-interactive execution
max_turns=agent_definition.max_turns, # Prevent runaway loops
setting_sources=None, # Ignore local user settings
)
Key constraints enforced at this stage:
allowed_tools— only the tools declared in the agent definition. The SDK will not execute any tool not in this list.permission_mode="acceptEdits"— required for autonomous operation. No human-in-the-loop prompts during execution.setting_sources=None— prevents local Claude settings from overriding the gateway's configuration. The agent definition is the sole authority.
2. Invocation¶
The gateway creates a persistent client that maintains conversation context across the entire session, then sends prompts and streams the responses:
from claude_agent_sdk import ClaudeSDKClient, AssistantMessage, ResultMessage
from claude_agent_sdk import TextBlock, ToolUseBlock, UserMessage, ToolResultBlock
# Create a persistent client that maintains conversation context
client = ClaudeSDKClient(options=options)
await client.connect()
# Send a prompt and stream the response
await client.query(prompt=prompt)
async for message in client.receive_response():
if isinstance(message, AssistantMessage):
for block in message.content:
if isinstance(block, TextBlock):
# Agent reasoning / response text
gateway.log_text(agent_id, block)
elif isinstance(block, ToolUseBlock):
# Agent requested a tool — the SDK executes it
# (only if in allowed_tools)
gateway.log_tool_use(agent_id, block)
elif isinstance(message, UserMessage):
# Tool results come back in UserMessage
for block in message.content:
if isinstance(block, ToolResultBlock):
# Gateway tracks taint based on tool results
gateway.log_tool_result(agent_id, block)
elif isinstance(message, ResultMessage):
# Single turn complete; client remains connected for next prompt
gateway.log_result(agent_id, message)
break
3. Tool Execution¶
During each agent turn, the LLM may request tool calls (Read, Write, Bash, etc.). The flow is:
- LLM emits a
ToolUseBlockrequesting a specific tool with parameters - The SDK checks the tool against
allowed_tools— rejects if not listed - If allowed, the SDK executes the tool and captures the result
- The result is fed back into the LLM context as a
UserMessagewithToolResultBlock— the persistent client handles this automatically - The LLM continues reasoning with the new information within the same turn
The gateway observes this loop (via streamed events) to add its own enforcement layer: taint tracking (marking the session as tainted if a tool returns untrusted external data) and logging (recording every tool call and result for auditability).
4. Completion¶
The agent loop ends when:
- The LLM produces a final response with no further tool calls
- The
max_turnslimit is reached - An error occurs (logged and surfaced to the gateway)
The gateway captures the ResultMessage which includes execution metadata
(duration, token usage, turn count) and records the full session transcript.
Inline Script Pattern¶
Agent scripts can be self-contained using PEP 723 inline script metadata, allowing UV to resolve dependencies without a separate project configuration:
Execution:
UV automatically creates an isolated environment, installs dependencies, and runs the script. This pattern keeps agents portable and self-describing.
Relationship to the Gateway¶
The agent runtime is a subprocess of the gateway. The gateway:
- Reads the agent definition (markdown file with frontmatter)
- Translates it into
ClaudeAgentOptions - Spawns an async Python process using
uv run - Streams and logs all messages from the agent loop
- Tracks taint based on tool results and input sources
- Records the full transcript on completion
The runtime has no awareness of security policy, taint, or risk scoring. It simply executes the loop with the constraints it was given. All security logic lives in the gateway.