This guide walks you through creating tasks for Helios. A task is a self-contained directory that defines what an agent should accomplish and how to verify success.

Task Structure

Every task is a directory with the following structure:
my-task/
├── instruction.md      # What the agent should do
├── task.toml           # Configuration
├── environment/        # Optional: custom Docker build
│   └── Dockerfile
└── tests/              # Optional: verification scripts
    └── test.sh
Only instruction.md and task.toml are required. The environment/ and tests/ directories are optional but recommended.
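
The required-file rule above can be checked before a task is run. The following is a minimal sketch; `check_task_dir` is a hypothetical helper written for this guide, not a Helios command.

```shell
# Hypothetical helper (not part of Helios): report whether a directory
# contains the two files every task requires.
check_task_dir() {
    local dir="$1"
    local required
    local missing=0
    for required in instruction.md task.toml; do
        if [ ! -f "$dir/$required" ]; then
            echo "missing: $dir/$required" >&2
            missing=1
        fi
    done
    # tests/ is optional, so its absence only produces a note.
    if [ ! -d "$dir/tests" ]; then
        echo "note: $dir/tests not found (optional but recommended)" >&2
    fi
    return "$missing"
}
```

For example, `check_task_dir tasks/my-first-task && echo "layout ok"` prints `layout ok` only when both required files are present.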

Quick Start: Create Your First Task

1. Create the task directory

mkdir -p tasks/my-first-task/tests
2. Write the instruction

Create tasks/my-first-task/instruction.md:
Create a file called greeting.txt in /home with the content "Hello from Helios!"
3. Create the configuration

Create tasks/my-first-task/task.toml:
version = "1.0"

[metadata]
author_name = "Your Name"
author_email = "you@example.com"
difficulty = "easy"
category = "file-creation"
tags = ["beginner", "files"]

[verifier]
timeout_sec = 120.0

[agent]
timeout_sec = 120.0

[environment]
docker_image = "ubuntu:22.04"
gui = false
cpus = 1
memory_mb = 2048
4. Add verification

Create tasks/my-first-task/tests/test.sh:
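#!/bin/bash
# Helios expects the reward at /logs/verifier/reward.txt, so create the directory first.
mkdir -p /logs/verifier

# Pass only if the file exists and contains the full greeting (fixed-string match).
if [ -f /home/greeting.txt ] && grep -qF "Hello from Helios!" /home/greeting.txt; then
    echo "PASS: File exists with correct content"
    echo 1 > /logs/verifier/reward.txt
else
    echo "FAIL: File missing or content incorrect"
    echo 0 > /logs/verifier/reward.txt
fi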
5. Run the task

helios tasks/my-first-task --watch

Task Types

Headless Tasks (CLI/Terminal)

For tasks that don’t require a graphical interface:
[environment]
docker_image = "ubuntu:22.04"
gui = false
Example use cases:
  • Create files and directories
  • Install and configure software
  • Run scripts and process data
  • Interact with APIs

GUI Tasks (Desktop)

For tasks requiring mouse and keyboard interaction:
1. Build the desktop image (once)

docker build -t cua-desktop -f docker/Dockerfile.desktop .
2. Configure the task

[environment]
docker_image = "cua-desktop"
gui = true
cpus = 2
memory_mb = 4096
Example use cases:
  • Browser automation
  • Desktop application interaction
  • Form filling
  • Visual testing

PDF Form Tasks

For PDF form-filling tasks (part of PDFBench):
1. Build the PDFBench base image (once)

docker build -t pdfbench-base -f docker/Dockerfile.pdfbench .
2. Configure the task

[environment]
docker_image = "pdfbench-base"
gui = true
cpus = 2
memory_mb = 4096

Custom Dockerfiles

For tasks requiring specific software, create a custom Dockerfile:
my-task/
├── instruction.md
├── task.toml
└── environment/
    └── Dockerfile

Example: Python Environment

environment/Dockerfile:
FROM python:3.12-slim

RUN pip install requests pandas numpy

WORKDIR /home
task.toml:
[environment]
docker_image = ""  # Leave empty to use custom Dockerfile
gui = false
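
An empty docker_image only makes sense when environment/Dockerfile exists, which is easy to get wrong when copying a task. The sketch below checks that pairing. `check_custom_image` is a hypothetical helper, the value is pulled out with a naive `sed` pattern rather than a real TOML parser, and treating a missing key like an empty value is an assumption.

```shell
# Hypothetical pre-flight check (not part of Helios): when docker_image is
# empty in task.toml, confirm that environment/Dockerfile is present.
check_custom_image() {
    local dir="$1"
    local image
    # Naive extraction of the quoted value. Assumes a single docker_image
    # line; a missing key is treated like an empty value (assumption).
    image=$(sed -n 's/^docker_image[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' \
        "$dir/task.toml" | head -n 1)
    if [ -z "$image" ] && [ ! -f "$dir/environment/Dockerfile" ]; then
        echo "docker_image is empty but $dir/environment/Dockerfile is missing" >&2
        return 1
    fi
    return 0
}
```

Running `check_custom_image my-task` returns a non-zero status only when the task names no image and ships no Dockerfile.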

Example: Node.js Environment

environment/Dockerfile:
FROM node:20-slim

RUN npm install -g typescript ts-node

WORKDIR /home

Example: GUI with Additional Tools

environment/Dockerfile:
FROM cua-desktop

RUN apt-get update && apt-get install -y \
    gimp \
    inkscape \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /home

Writing Good Instructions

Be Specific and Measurable

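Too vague: "Make a website"
More specific (illustrative): "Create /home/site/index.html with an <h1> heading that reads Welcome"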

Include Success Criteria

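Too vague: "Set up a database"
With success criteria (illustrative): "Install SQLite and create /home/app.db with a table named users that has the columns id and name"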

Specify File Paths

Too vague: "Save the output"
With a file path (illustrative): "Save the output to /home/output.txt"

Example Tasks

Example 1: File Creation (Easy)

Create a file at /home/hello.txt containing exactly "Hello World"
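
Because the instruction says "exactly", the verifier should compare the whole file rather than search for a substring. A minimal sketch of what tests/test.sh could contain follows. It is wrapped in a function, `verify_hello`, whose two path parameters are hypothetical additions so the check can be tried outside a container; the defaults match the paths used in this guide.

```shell
# Sketch of a verifier for Example 1. The function wrapper and its
# parameters are illustrative; the defaults are the paths from this guide.
verify_hello() {
    local target="${1:-/home/hello.txt}"
    local reward_dir="${2:-/logs/verifier}"
    mkdir -p "$reward_dir"
    # Command substitution strips trailing newlines, so the file passes with
    # or without a final newline; any other difference fails.
    if [ -f "$target" ] && [ "$(cat "$target")" = "Hello World" ]; then
        echo 1 > "$reward_dir/reward.txt"
    else
        echo 0 > "$reward_dir/reward.txt"
    fi
}
```

In a real tests/test.sh, the script would end with a bare `verify_hello` call so the default paths are used.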

Example 2: Web Scraping (Medium)

Write a Python script at /home/scraper.py that:
1. Fetches https://example.com
2. Extracts the page title
3. Saves it to /home/title.txt
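
A verifier for this task can check both artifacts: the script and the saved title. The sketch below assumes the page title is "Example Domain", which is what https://example.com serves at the time of writing. `verify_scraper` and its path parameters are hypothetical and exist only so the check can be run outside a container.

```shell
# Sketch of a verifier for Example 2. Parameters are illustrative; the
# defaults are the paths named in the instruction.
verify_scraper() {
    local home_dir="${1:-/home}"
    local reward_dir="${2:-/logs/verifier}"
    mkdir -p "$reward_dir"
    # Require the script itself and the expected title text in title.txt.
    # grep's error for a missing title.txt is silenced and counts as a failure.
    if [ -f "$home_dir/scraper.py" ] &&
        grep -qF "Example Domain" "$home_dir/title.txt" 2>/dev/null; then
        echo 1 > "$reward_dir/reward.txt"
    else
        echo 0 > "$reward_dir/reward.txt"
    fi
}
```

This checks the result, not how it was produced. A stricter verifier might delete title.txt and rerun scraper.py itself, at the cost of needing network access during verification.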

Example 3: GUI Browser Task (Hard)

Open Firefox, navigate to https://example.com, and take a screenshot.
Save the screenshot as /home/screenshot.png
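
GUI results are harder to verify than text files. One inexpensive check is that the saved file really is a PNG, which can be done by reading its 8-byte signature. The sketch below does only that: it does not confirm that the image shows the requested page. `verify_screenshot` and its parameters are hypothetical.

```shell
# Sketch of a verifier for Example 3. Parameters are illustrative; the
# defaults are the paths named in the instruction.
verify_screenshot() {
    local target="${1:-/home/screenshot.png}"
    local reward_dir="${2:-/logs/verifier}"
    local signature
    mkdir -p "$reward_dir"
    # Every PNG file starts with the bytes 89 50 4e 47 0d 0a 1a 0a.
    # A missing file yields an empty signature and therefore a failure.
    signature=$(head -c 8 "$target" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    if [ "$signature" = "89504e470d0a1a0a" ]; then
        echo 1 > "$reward_dir/reward.txt"
    else
        echo 0 > "$reward_dir/reward.txt"
    fi
}
```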

Troubleshooting

If the task cannot be loaded:
Ensure your task directory contains both instruction.md and task.toml.

If verification fails or no reward is recorded:
  1. Test your test.sh script manually inside a container
  2. Check that the /logs/verifier/ directory is created
  3. Verify the reward file path is exactly /logs/verifier/reward.txt

If the Docker build fails:
  1. Check your Dockerfile syntax
  2. Ensure base images exist (docker pull ubuntu:22.04)
  3. Increase build_timeout_sec for complex builds

If the agent times out:
Increase timeout_sec in the [agent] section, or simplify the task.

If the desktop does not appear for a GUI task:
  1. Ensure gui = true in task.toml
  2. Build the desktop image: docker build -t cua-desktop -f docker/Dockerfile.desktop .
  3. Use --watch to see the live view
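
Testing test.sh manually inside a container can be done with a throwaway `docker run`. The sketch below is one way to do it. `run_verifier` is a hypothetical helper, and the `/tests` mount point is an assumption made for this example rather than a description of how Helios mounts the directory.

```shell
# Hypothetical helper: run a task's tests/test.sh in a fresh container and
# print the reward it writes. The /tests mount point is an assumption.
run_verifier() {
    local task_dir="$1"
    local image="${2:-ubuntu:22.04}"
    # Mount tests/ read-only, run the script, then show the reward file.
    docker run --rm \
        -v "$(cd "$task_dir" && pwd)/tests:/tests:ro" \
        "$image" \
        bash -c 'bash /tests/test.sh; echo "reward: $(cat /logs/verifier/reward.txt)"'
}
```

Called as `run_verifier tasks/my-first-task`, this starts from a clean image in which no agent has acted, so a correct verifier should print its FAIL message and `reward: 0`. A missing-file error from `cat` instead points to the reward path problems listed above.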
