Helios is composed of five core components that work together to execute and verify agent tasks.

Task

A Task is a directory containing everything an agent needs to complete a goal.
my-task/
├── instruction.md      # Natural language instructions
├── task.toml           # Configuration (timeouts, resources, Docker image)
├── environment/        # Optional: custom Dockerfile
│   └── Dockerfile
└── tests/              # Verification scripts
    └── test.sh

instruction.md

Natural language description of what the agent should accomplish. This becomes part of the agent’s prompt.

task.toml

Configuration including Docker image, GUI mode, timeouts, and resource limits.
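As a rough illustration of the layout above, the following sketch scaffolds a minimal task directory. The helper name `scaffold_task` and the example instruction text are hypothetical, and `task.toml` is left empty because its keys are not listed in this overview; whether Helios accepts an empty configuration file is an assumption to check against the task documentation.

```shell
#!/bin/bash
# Illustrative scaffold; scaffold_task is a hypothetical helper, not a Helios command.
scaffold_task() {
  task_dir="$1"
  mkdir -p "$task_dir/tests"

  # Example instruction (hypothetical); it becomes part of the agent's prompt.
  echo 'Create the file /home/hello.txt containing the text "hello".' \
    > "$task_dir/instruction.md"

  # Left empty on purpose: the configuration keys are not listed in this overview.
  : > "$task_dir/task.toml"

  # Verification script: writes 1 (pass) or 0 (fail) to the reward file.
  cat > "$task_dir/tests/test.sh" <<'EOF'
#!/bin/bash
mkdir -p /logs/verifier
if [ -f /home/hello.txt ]; then
    echo 1 > /logs/verifier/reward.txt
else
    echo 0 > /logs/verifier/reward.txt
fi
EOF
  chmod +x "$task_dir/tests/test.sh"
}

# Usage: scaffold_task tasks/my-task
```

The optional `environment/` directory is omitted here, since it is only needed for a custom Dockerfile.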

Learn more about Tasks

See how to create and configure tasks

Gateway

The Gateway is the abstraction layer over LLM providers. It routes requests to the appropriate backend based on the model identifier.
Model Pattern                        Provider
gemini/* or contains gemini          Google Gemini
claude-*                             Anthropic Direct
bedrock/* or contains anthropic.     AWS Bedrock
openai/* or computer-use-preview     OpenAI
# The gateway automatically routes based on model name
helios tasks/my-task -m claude-sonnet-4-20250514  # → Anthropic
helios tasks/my-task -m openai/computer-use-preview  # → OpenAI
helios tasks/my-task -m gemini/gemini-2.5-computer-use-preview-10-2025  # → Gemini
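The routing rules can be pictured as a simple pattern match on the model identifier. The sketch below is not the Gateway's actual code: `route_model` is a hypothetical function, and the order in which the patterns are tried is an assumption, since the table does not say which rule wins when several could match.

```shell
#!/bin/bash
# Illustrative only: route_model is a hypothetical helper, not part of Helios.
# It mirrors the pattern table; the order of the branches is an assumption.
route_model() {
  case "$1" in
    gemini/*|*gemini*)              echo "Google Gemini" ;;
    claude-*)                       echo "Anthropic Direct" ;;
    bedrock/*|*anthropic.*)         echo "AWS Bedrock" ;;
    openai/*|computer-use-preview)  echo "OpenAI" ;;
    *)                              echo "unknown"; return 1 ;;
  esac
}

route_model claude-sonnet-4-20250514                        # prints "Anthropic Direct"
route_model openai/computer-use-preview                     # prints "OpenAI"
route_model gemini/gemini-2.5-computer-use-preview-10-2025  # prints "Google Gemini"
```

Because `claude-*` only matches at the start of the identifier, a name that merely contains `anthropic.` falls through to the Bedrock branch in this sketch.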

Model Configuration

See all supported models and how to configure them

Environment

The Environment provisions isolated containers for agent execution. Helios supports two providers:

Docker (Local)

Run containers locally using Docker. Supports headless and GUI modes with VNC.

Daytona (Cloud)

Run in cloud sandboxes for scalable, distributed execution.

Execution Modes

Mode       Description                           Use Case
Headless   No GUI, bash and editor tools only    CLI tasks, scripting, file manipulation
GUI        Full desktop with VNC                 Browser automation, desktop apps, visual tasks

Environment Configuration

Learn about Docker images, GUI mode, and cloud providers

Tools

Tools are the interfaces agents use to interact with the environment.
The bash tool executes shell commands inside the container. It is available in all modes.
# Examples
echo "hello" > /home/hello.txt
apt-get update && apt-get install -y curl
python script.py
The editor tool makes structured file edits, which is useful for multi-line changes where shell commands would be brittle. It can:
  • View file contents
  • Insert or replace text
  • Create new files
A GUI tool provides mouse and keyboard interactions. It is only available in GUI mode.
Action         Description
left_click     Click at coordinates
right_click    Right-click
double_click   Double-click
triple_click   Select line
type           Type text
key            Press key (e.g., Return, ctrl+c)
scroll         Scroll up/down
screenshot     Capture screen

Tools Reference

Complete tool documentation and examples

Verifier

The Verifier runs tests/test.sh inside the container after the agent completes (or times out) and checks the outcome.

Reward Values

Value     Meaning
1         Pass
0         Fail
0.0-1.0   Partial credit
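To make the table concrete, here is a small sketch that classifies a reward value the way the table describes. `interpret_reward` is a hypothetical helper for illustration; it does not show how Helios itself parses the value, and treating anything outside 0 to 1 as invalid is an assumption.

```shell
#!/bin/bash
# Illustrative only: interpret_reward is a hypothetical helper, not Helios code.
# Maps a reward value to the meanings in the table above.
interpret_reward() {
  awk -v r="$1" 'BEGIN {
    # Accept plain non-negative decimals such as 0, 1, 0.5 or 1.0.
    if (r !~ /^[0-9]*\.?[0-9]+$/ || r > 1) { print "invalid"; exit 1 }
    if (r == 1)      print "pass"
    else if (r == 0) print "fail"
    else             print "partial"
  }'
}

interpret_reward 1     # prints "pass"
interpret_reward 0     # prints "fail"
interpret_reward 0.5   # prints "partial"
```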

How It Works

#!/bin/bash
mkdir -p /logs/verifier

# Check if the task was completed
if [ -f /home/hello.txt ]; then
    echo 1 > /logs/verifier/reward.txt
else
    echo 0 > /logs/verifier/reward.txt
fi
The verifier writes a reward value to /logs/verifier/reward.txt, which Helios reads to determine success or failure.
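A verifier can also award partial credit by writing a fraction instead of 0 or 1. The sketch below is one possible approach, not a prescribed pattern: the three checks and the helper names `score` and `run_checks` are hypothetical, and it assumes Helios accepts a decimal such as `0.666667` in `reward.txt`.

```shell
#!/bin/bash
# Illustrative partial-credit verifier; the checks and helper names are hypothetical.

# Print passed/total as a number between 0 and 1 (0 if there are no checks).
score() {
  awk -v p="$1" -v t="$2" 'BEGIN { if (t == 0) print 0; else print p / t }'
}

# Run three example checks against the directory given as $1.
run_checks() {
  dir="$1"
  passed=0
  total=3
  [ -f "$dir/hello.txt" ] && passed=$((passed + 1))                         # file exists
  [ -s "$dir/hello.txt" ] && passed=$((passed + 1))                         # file is non-empty
  grep -q '^hello$' "$dir/hello.txt" 2>/dev/null && passed=$((passed + 1))  # expected content
  score "$passed" "$total"
}

# In tests/test.sh this could be used as:
#   mkdir -p /logs/verifier
#   run_checks /home > /logs/verifier/reward.txt
```

Keeping the checks in a function that takes the directory as an argument makes the script easy to try outside the container before wiring it to `/logs/verifier/reward.txt`.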

Verification Guide

Write robust verification scripts

Web Viewer

The Web Viewer provides real-time observability into agent execution.
helios tasks/my-task --watch
# → Open http://localhost:8080

What You Can See

  • Live VNC view of the desktop (GUI mode)
  • Screenshots at each step
  • Tool calls and their outputs
  • LLM messages and responses
  • Verification results

Web Viewer Guide

Learn how to use the web viewer

Data Flow

Here’s how a task execution flows through Helios:
┌─────────────────────────────────────────────────────────────────────────┐
│ 1. Load Task                                                            │
│    instruction.md + task.toml → Task object                             │
└─────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────┐
│ 2. Provision Environment                                                │
│    Docker build (if custom) → Container start → GUI/VNC (if enabled)   │
└─────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────┐
│ 3. Agent Loop                                                           │
│    LLM → Tool call → Execute → Screenshot → LLM → ... (until done)     │
└─────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────┐
│ 4. Verification                                                         │
│    Run test.sh → Read reward.txt → Report result                       │
└─────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────┐
│ 5. Cleanup                                                              │
│    Save trajectory → Stop container → Report final result              │
└─────────────────────────────────────────────────────────────────────────┘

Next Steps