Functional commands require authentication unless noted otherwise. Add --json to supported commands when you need machine-readable output. Angle brackets, as in <taskId> and <apiKey>, mark required placeholders; replace them with your actual task ID or API key.

Help and diagnostics

Use these commands to check the installed CLI version, inspect available capabilities, and diagnose the local runtime.
octoparse --help
octoparse --version
octoparse capabilities
octoparse doctor
octoparse browser doctor
Use JSON output when you need structured diagnostics:
octoparse capabilities --json
octoparse doctor --json
octoparse browser doctor --json
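When diagnostics need to be attached to a bug report or kept as a CI artifact, the three JSON commands above can be captured to files. The following is a minimal sketch, assuming each command prints its JSON to standard output; the function name collect_diagnostics, the default directory, and the file names are hypothetical.

```shell
# Hypothetical helper: save each JSON diagnostic report into a directory.
# Assumes every command writes its JSON report to standard output.
collect_diagnostics() {
  out_dir="${1:-./octoparse-diagnostics}"
  status=0
  mkdir -p "$out_dir" || return 1
  # Keep going if one report fails so the remaining reports are still saved.
  octoparse capabilities --json > "$out_dir/capabilities.json" || status=1
  octoparse doctor --json > "$out_dir/doctor.json" || status=1
  octoparse browser doctor --json > "$out_dir/browser-doctor.json" || status=1
  return "$status"
}
```

Calling collect_diagnostics ./reports would write three files under ./reports and return a non-zero status if any of the commands failed.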

Authentication

Use these commands to log in, check the active session, or remove stored credentials.
octoparse auth login
octoparse auth login <apiKey>
octoparse auth login --stdin
octoparse auth login --no-open
octoparse auth status
octoparse auth logout
After login, run octoparse auth status to confirm your session is active.
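In scripts it is often preferable to keep the API key out of the command line. The following is a minimal sketch, assuming octoparse auth login --stdin reads the API key from standard input (the option name suggests this, but the reference above does not spell it out); the environment variable OCTOPARSE_API_KEY and the function name login_from_env are hypothetical.

```shell
# Hypothetical helper: log in with a key taken from an environment variable,
# then confirm the session. Assumes --stdin reads the API key from stdin.
login_from_env() {
  if [ -z "${OCTOPARSE_API_KEY:-}" ]; then
    echo "OCTOPARSE_API_KEY is not set" >&2
    return 1
  fi
  printf '%s\n' "$OCTOPARSE_API_KEY" | octoparse auth login --stdin || return 1
  # Confirm that the session is active after logging in.
  octoparse auth status
}
```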

Tasks

Use these commands to list tasks, inspect task definitions, and validate whether a task can be used by the CLI.
octoparse task list
octoparse task list --page 2 --page-size 20
octoparse task list --limit 10
octoparse task list --keyword news
octoparse task inspect <taskId>
octoparse task validate <taskId>
You can also provide a task file:
octoparse task inspect <taskId> --task-file <file.json|file.xml|file.otd>
octoparse task validate <taskId> --task-file <file.json|file.xml|file.otd>
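Because --task-file is shown with three file types (.json, .xml, .otd), a wrapper can reject other extensions before calling the CLI. This is a minimal sketch; the function name validate_with_task_file and the exit code 2 for a rejected file are hypothetical choices, and the check looks only at the file name, not the file contents.

```shell
# Hypothetical helper: check the task file extension, then validate the task.
validate_with_task_file() {
  task_id="$1"
  task_file="$2"
  case "$task_file" in
    *.json|*.xml|*.otd)
      octoparse task validate "$task_id" --task-file "$task_file"
      ;;
    *)
      echo "unsupported task file type: $task_file" >&2
      return 2
      ;;
  esac
}
```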

Local extraction

Use these commands to run a task with the local embedded engine.
octoparse run <taskId>
octoparse run <taskId> --headless
octoparse run <taskId> --max-rows 100
octoparse run <taskId> --detach
octoparse run <taskId> --output ./runs
After detaching a local run, use octoparse local status, octoparse local pause, octoparse local resume, or octoparse local stop to manage it.
Use a custom Chrome path if needed:
octoparse run <taskId> --chrome-path /path/to/chrome
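The detach-then-manage flow described above can be sketched as a small wrapper. It assumes that octoparse run with --detach returns once the run has started; the function name run_detached is hypothetical.

```shell
# Hypothetical helper: start a detached local run, then report its status.
# Assumes `octoparse run --detach` returns after the run has been started.
run_detached() {
  task_id="$1"
  octoparse run "$task_id" --detach || return 1
  octoparse local status "$task_id"
}
```

From there, octoparse local pause, octoparse local resume, and octoparse local stop can be issued with the same task ID.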

Cloud extraction

Use these commands to start, stop, monitor, and review task runs on Octoparse cloud servers.
octoparse cloud start <taskId>
octoparse cloud stop <taskId>
octoparse cloud status <taskId>
octoparse cloud history <taskId>

Local run control

Use these commands to check, pause, resume, or stop a local run, review its history, export its data, and clean up local run state.
octoparse local status <taskId>
octoparse local pause <taskId>
octoparse local resume <taskId>
octoparse local stop <taskId>
octoparse local history <taskId>
octoparse local export <taskId> --format xlsx
octoparse local cleanup
octoparse local cleanup removes stale local run state and temporary files. It does not stop currently running tasks.

Data history and export

Use these commands to inspect collected data history and export local or cloud results.
octoparse data history <taskId> --source local
octoparse data history <taskId> --source cloud
octoparse data export <taskId> --source local --format xlsx
octoparse data export <taskId> --source cloud --format csv
Supported export formats:
xlsx
csv
html
json
xml
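A script that accepts the source and format from a user can check both against the values listed above before exporting. This is a minimal sketch; the function name export_task_data and the exit code 2 are hypothetical, and it assumes every listed format is accepted for both sources, which the examples above do not confirm.

```shell
# Hypothetical helper: reject unknown sources or formats before exporting.
# Assumes each documented format works with both local and cloud sources.
export_task_data() {
  task_id="$1"
  data_source="$2"
  export_format="$3"
  case "$data_source" in
    local|cloud) ;;
    *) echo "unknown source: $data_source" >&2; return 2 ;;
  esac
  case "$export_format" in
    xlsx|csv|html|json|xml) ;;
    *) echo "unsupported format: $export_format" >&2; return 2 ;;
  esac
  octoparse data export "$task_id" --source "$data_source" --format "$export_format"
}
```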

Task file format

A task file can use this structure:
{
  "taskId": "abc123",
  "taskName": "Example",
  "xml": "... original OTD XML ...",
  "xoml": "... transformed BPMN XOML ...",
  "fieldNames": ["title", "url"],
  "workflowSetting": {},
  "brokerSettings": {},
  "userAgent": "Mozilla/5.0 ...",
  "disableAD": false
}
Key fields:
| Field | Meaning |
| --- | --- |
| taskId | Octoparse task ID |
| taskName | Human-readable task name |
| xml | Original OTD XML definition |
| xoml | Transformed workflow definition used by the engine |
| fieldNames | Output field names expected from the task |
| workflowSetting | Task workflow settings |
| brokerSettings | Runtime or broker-related task settings |
| userAgent | Browser user agent used during extraction |
| disableAD | Whether ad blocking is disabled |
API key authentication is required for functional commands, including local --task-file and .otd runs.
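Before passing a hand-edited JSON task file to --task-file, it can help to see which of the documented fields it contains. The following is a rough sketch that prints the documented keys it cannot find. It is a plain text search rather than a JSON parser, the function name list_missing_task_fields is hypothetical, and the reference above does not say which fields are mandatory, so a reported key is a prompt to double-check rather than an error.

```shell
# Hypothetical helper: print documented task-file keys that do not appear
# in the file. This is a text search for "key": and does not parse JSON.
list_missing_task_fields() {
  task_file="$1"
  [ -r "$task_file" ] || { echo "cannot read: $task_file" >&2; return 2; }
  for key in taskId taskName xml xoml fieldNames workflowSetting \
             brokerSettings userAgent disableAD; do
    grep -q "\"$key\"[[:space:]]*:" "$task_file" || echo "$key"
  done
}
```

An empty result means every documented key name was found; octoparse task validate remains the way to check whether the CLI can actually use the task.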