AI Task nodes can be equipped with a wide range of capabilities. Each capability unlocks additional tools the agent can use during execution. We suggest starting with the default capabilities and enabling additional ones only when strictly needed.

Web Browsing Essentials

Core tools for basic navigation and DOM interaction.

Go to URL

Navigate the browser to any URL

Refresh Page

Reload the current webpage

Basic Web Interaction

Click, type, select elements, and interact via the DOM

Advanced Web Browsing Toolkit

Tools for extraction, evaluation, and page manipulation

Extract HTML

Download or inspect the full HTML of the page

Get Text

Extract visible text as clean markdown

Evaluate JavaScript

Run custom JS in the page context

Zoom Out

Reduce zoom level to view more content

Zoom In

Increase zoom level for readability

Computer Vision

Tools for image-based interaction when DOM access is insufficient.

Computer Vision

Click, locate, and interact using visual recognition

Communication

Send messages or work with email during execution.

Send User Message

Ask the user questions or request clarification

Send Mail

Send emails with custom subject and body

Get Mail

Retrieve inbound emails from the agent’s inbox

File System

Work with files locally or inside the browser session.

List Files

View all files available in the execution context

Read Files

Read text, images, PDFs, or downloaded files

Upload File

Upload files to file input elements on a webpage

Memory & Storage

Store and retrieve execution-scoped data.

Write Scratchpad

Save notes or structured data to memory

Read Scratchpad

Retrieve previously stored information

Read Clipboard

Access the current clipboard contents

Google Sheets

Read and write spreadsheet data.

Sheets: Get Data

Retrieve values from cell ranges like A1:B10

Sheets: Set Value

Update values in specific cells
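
For readers unfamiliar with range notation such as `A1:B10`, the sketch below shows how a range of that form maps to row and column bounds. It is illustrative only: the function names `col_to_index` and `parse_a1_range` are hypothetical and are not part of the Sheets tools, and the sketch handles only simple two-corner ranges without sheet names.

```python
import re


def col_to_index(letters: str) -> int:
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_a1_range(a1: str) -> tuple:
    """Return (first_row, first_col, last_row, last_col), all 1-based.

    Hypothetical helper: accepts only ranges shaped like "A1:B10".
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", a1.strip())
    if match is None:
        raise ValueError(f"Unsupported range: {a1!r}")
    col1, row1, col2, row2 = match.groups()
    return (int(row1), col_to_index(col1), int(row2), col_to_index(col2))
```

Under these assumptions, `parse_a1_range("A1:B10")` covers rows 1 through 10 and columns 1 through 2, which is a block of 20 cells.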

Authentication

Generate tokens and handle one-time passwords.

Generate TOTP Code

Produce 6-digit MFA codes using stored credentials
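
To make the idea of a 6-digit MFA code concrete, here is a minimal sketch of the standard TOTP algorithm (RFC 6238, HMAC-SHA1, 30-second step). It assumes the stored credential is a base32-encoded shared secret, which is the common convention but is not confirmed by this page. The function name `totp` and its parameters are hypothetical, and this is not presented as the tool's actual implementation.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, for_time: Optional[float] = None,
         digits: int = 6, step: int = 30) -> str:
    """Compute a time-based one-time password per RFC 6238 (HMAC-SHA1)."""
    # Base32 secrets are often stored without padding; restore it before decoding.
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)

    # The moving factor is the number of whole time steps since the Unix epoch.
    now = time.time() if for_time is None else for_time
    counter = int(now // step)

    # HOTP (RFC 4226): HMAC the 8-byte big-endian counter, then truncate.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF

    return str(code % 10 ** digits).zfill(digits)
```

With the RFC 6238 test secret (`GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ`, the base32 form of `12345678901234567890`) and `for_time=59`, the 8-digit result is `94287082`, matching the published test vector. Codes change every `step` seconds, so a code produced near the end of a window may expire before it is submitted.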

Context & Utilities

Access deeper execution context or system utilities.

Query Context

Ask questions about past actions and stored information

Get Datetime

Fetch the current datetime in any timezone
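
As an illustration of what a timezone-aware datetime lookup involves, the sketch below uses Python's standard `zoneinfo` module with IANA timezone names. The function `current_datetime` and its `now` parameter are hypothetical and exist only for this example. Whether the tool itself accepts IANA names is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def current_datetime(tz_name: str, now: Optional[datetime] = None) -> datetime:
    """Return the current (or supplied) instant expressed in the given timezone.

    tz_name is an IANA name such as "Europe/Rome" or "Asia/Tokyo".
    `now` must be timezone-aware if provided; it exists to make the example testable.
    """
    instant = now if now is not None else datetime.now(timezone.utc)
    return instant.astimezone(ZoneInfo(tz_name))
```

Calling `current_datetime("Asia/Tokyo").isoformat()` yields an ISO 8601 string that includes the UTC offset, which keeps the value unambiguous when it is stored or compared later in an execution.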