This page explains the core concepts used across Octoparse. Understanding these terms makes it easier to build tasks, troubleshoot extraction issues, and connect results to downstream tools.
Task
A task is a reusable extraction workflow. It contains the target website, the steps Octoparse should perform, the fields to extract, and the run/export settings. A task may include:

- Opening one or more URLs
- Clicking buttons or links
- Looping through lists
- Handling pagination
- Opening detail pages
- Extracting fields
- Cleaning field values
- Running locally or in the cloud
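
To make the parts of a task concrete, here is a minimal conceptual sketch in Python. It is not an Octoparse API: tasks are built in the Octoparse interface, and the `TaskSketch` class, its attribute names, and the example URL are all hypothetical, chosen only to show what a task bundles together.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskSketch:
    """Hypothetical model of what a task bundles together (not an Octoparse API)."""
    name: str
    start_urls: List[str]                              # target website(s)
    steps: List[str] = field(default_factory=list)     # ordered workflow steps
    fields: List[str] = field(default_factory=list)    # columns to extract
    run_mode: str = "local"                            # "local" or "cloud"

    def summary(self) -> str:
        # One-line description of the task's contents.
        return (f"{self.name}: {len(self.start_urls)} URL(s), "
                f"{len(self.steps)} step(s), {len(self.fields)} field(s), "
                f"runs in {self.run_mode} mode")


# Illustrative values only; example.com is a placeholder site.
task = TaskSketch(
    name="Product listing",
    start_urls=["https://example.com/products"],
    steps=["open page", "loop items", "extract data", "pagination"],
    fields=["Product name", "Price", "Product URL"],
    run_mode="cloud",
)
print(task.summary())
```

Because the task holds the URLs, steps, fields, and run settings together, the same definition can be run again later without rebuilding the workflow.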
Template
A template is a prebuilt task for a common website or use case. Templates help you start faster because the extraction workflow and fields are already configured. Use templates when:

- A matching website template is available
- You need a faster setup path
- You want a standard structure for common data types
- You do not need heavy customization
Workflow actions
Workflow actions define how Octoparse moves through a website. Common actions include:

| Action | What it does |
|---|---|
| Open page | Loads a target URL |
| Click | Clicks a button, link, menu item, or page element |
| Loop | Repeats an action across multiple items |
| Pagination | Moves through multiple result pages |
| Scroll | Loads more content on pages with infinite scroll or lazy loading |
| Wait | Gives dynamic content time to load |
| Extract data | Captures values from selected elements |
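
The way these actions nest can be sketched in plain Python. This is an illustration under stated assumptions, not how Octoparse is implemented: the `PAGES` data stands in for a paginated site, and `run_workflow` is a hypothetical helper. The outer loop plays the role of Pagination, the inner loop plays the role of Loop, and building each row corresponds to Extract data.

```python
from typing import Dict, List

# Hypothetical in-memory "site": each inner list is one page of results.
PAGES: List[List[Dict[str, str]]] = [
    [{"name": "Kettle", "price": "$25"}, {"name": "Toaster", "price": "$40"}],
    [{"name": "Blender", "price": "$60"}],
]


def run_workflow(pages: List[List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Walk every page and every item, collecting one row per item."""
    rows = []
    for page in pages:          # Pagination: move through the result pages
        for item in page:       # Loop: repeat the same step for each item
            rows.append({       # Extract data: capture the selected values
                "Product name": item["name"],
                "Price": item["price"],
            })
    return rows


rows = run_workflow(PAGES)
print(len(rows), "rows extracted")
```

On a real site, Scroll and Wait steps would sit inside the page loop so that dynamic content has loaded before extraction starts.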
Field
A field is a column in your extracted data. Examples include product name, price, rating, URL, date, company name, address, or review text. Fields should be named clearly so exported data is easy to understand. Good field names are specific:

| Less clear | Better |
|---|---|
| Text 1 | Product name |
| Field 2 | Price |
| Link | Product URL |
| Date | Review date |
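
If data has already been exported with generic field names, the mapping in the table above can be applied afterwards. The following is a minimal sketch in plain Python, not an Octoparse feature; `RENAMES`, `rename_fields`, and the sample row are hypothetical.

```python
from typing import Dict, List

# Mapping taken from the table above: generic name -> specific name.
RENAMES: Dict[str, str] = {
    "Text 1": "Product name",
    "Field 2": "Price",
    "Link": "Product URL",
    "Date": "Review date",
}


def rename_fields(rows: List[Dict[str, str]],
                  renames: Dict[str, str]) -> List[Dict[str, str]]:
    """Return copies of the rows with generic column names replaced.

    Columns that are not in the mapping keep their original name.
    """
    return [{renames.get(key, key): value for key, value in row.items()}
            for row in rows]


# Illustrative exported row; the values are made up.
exported = [{
    "Text 1": "Kettle",
    "Field 2": "$25",
    "Link": "https://example.com/kettle",
    "Date": "2024-01-05",
}]
cleaned = rename_fields(exported, RENAMES)
print(list(cleaned[0].keys()))
```

Naming fields clearly in the task itself avoids this extra step, since every later export then carries the specific names.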