Octoparse is a no-code web data extraction platform. It helps you turn website content into structured data by building scraping workflows visually, running them locally or in the cloud, and exporting the results to files, databases, cloud storage, or downstream systems. Instead of writing and maintaining custom scrapers from scratch, you can use Octoparse to define what to collect, how to navigate a website, and where the extracted data should go.

What Octoparse does

Build scraping workflows

Select data on a web page, define actions such as clicks and pagination, and turn the workflow into a reusable task.

Run data extraction

Execute tasks locally or in the cloud, depending on the website, task setup, and automation needs.

Export structured data

Send extracted results to formats and destinations such as CSV, Excel, JSON, Google Sheets, databases, or cloud storage.

How it works

At a high level, an Octoparse workflow moves through three stages:

1. Open the target website

Start from a URL, template, or custom task. Octoparse loads the page in its built-in browser so you can inspect and interact with the site.

2. Define the extraction logic

Select the data fields you want, then add actions such as clicking links, handling pagination, scrolling, logging in, or refining field values.

3. Run and export the task

Run the task locally or in the cloud, monitor progress, then export the collected data to the format or destination you need.
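
For readers who think in code, the three stages can be pictured as a small data structure. The sketch below is illustrative only: Octoparse tasks are built visually, and the `Workflow` and `Action` classes, their fields, and the example URL are hypothetical names invented here, not Octoparse's task format or API.

```python
from dataclasses import dataclass, field


# Hypothetical model for illustration; not Octoparse's task format or API.
@dataclass
class Action:
    kind: str          # e.g. "click", "paginate", "scroll", "login"
    target: str = ""   # free-form description of the element acted on


@dataclass
class Workflow:
    start_url: str                                        # stage 1: where to open
    fields: list[str] = field(default_factory=list)       # stage 2: what to collect
    actions: list[Action] = field(default_factory=list)   # stage 2: how to navigate
    run_mode: str = "local"                               # stage 3: "local" or "cloud"
    export_format: str = "csv"                            # stage 3: output format

    def describe(self) -> list[str]:
        """Return the three stages as human-readable lines."""
        steps = ", ".join(a.kind for a in self.actions) or "none"
        return [
            f"1. Open {self.start_url}",
            f"2. Extract {', '.join(self.fields)} (actions: {steps})",
            f"3. Run in {self.run_mode} mode and export as {self.export_format}",
        ]


workflow = Workflow(
    start_url="https://example.com/products",
    fields=["title", "price"],
    actions=[Action("click", "product link"), Action("paginate", "next button")],
    run_mode="cloud",
    export_format="json",
)
for line in workflow.describe():
    print(line)
```

The point of the model is that a task bundles navigation, extraction, and export decisions together, rather than being a single request for a single page.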

What Octoparse is good for

Octoparse is useful when data is available on websites but not provided in a convenient structured format. Common use cases include:
  • Price monitoring
  • Lead generation
  • Market research
  • Product and review extraction
  • Directory and listing collection
  • Real estate, job, ecommerce, and social data collection
  • Recurring data collection for operations or reporting workflows
Octoparse is especially helpful when the target website requires browser interaction, such as clicking buttons, scrolling pages, opening detail pages, or extracting data from dynamic content.

What Octoparse is not

Octoparse is not a general-purpose analytics, BI, or database platform. It helps collect and structure web data, but downstream analysis, modeling, and reporting usually happen in other tools. It is also not the same as a hosted scraping API where you only send a URL and receive a standardized response. Octoparse tasks are workflows: you define how the website should be opened, navigated, extracted, cleaned, and exported.

| Octoparse is | Octoparse is not |
| --- | --- |
| A no-code web data extraction platform | A BI dashboarding tool |
| A visual workflow builder for scraping tasks | A generic RPA platform for all desktop actions |
| A way to run and automate extraction tasks | A one-size-fits-all hosted scraping API |
| A tool for producing structured web data | A replacement for data analysis or reporting tools |
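
Because analysis usually happens outside Octoparse, a typical next step is to load an exported file in another tool. The following minimal sketch uses only the Python standard library. It assumes a CSV export with hypothetical `title` and `price` columns; the inline sample text stands in for a real exported file.

```python
import csv
import io
from statistics import mean

# Stand-in for an exported CSV file; the columns and rows are hypothetical.
EXPORTED_CSV = """title,price
Widget A,19.50
Widget B,24.50
Widget C,
"""


def load_prices(csv_text: str) -> list[float]:
    """Parse the price column, skipping rows where the field is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row["price"]) for row in reader if row["price"].strip()]


prices = load_prices(EXPORTED_CSV)
print(f"{len(prices)} priced items, average {mean(prices):.2f}")
```

With a real export, replacing the `io.StringIO` wrapper with an opened file handle should be the only change needed, provided the column names match.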

Browser-based extraction

Octoparse includes a built-in browser for interacting with websites while building tasks. In normal task-building mode, clicks are used to select elements and create actions. When manual interaction is needed, such as logging in, closing popups, or solving a challenge before continuing setup, Browse Mode lets the built-in browser behave more like a regular browser. This browser-based approach is useful for websites where the data is not available in static HTML or where extraction depends on page interaction.
The exact behavior of a task depends on the website structure, task settings, extraction mode, and whether the task runs locally or in the cloud.
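
To see why static HTML is sometimes not enough, consider a page whose list is filled in by a script after loading. The sketch below parses a hypothetical page source with Python's standard `html.parser` module and counts the list items present in the raw markup. The markup is invented for illustration and does not come from any real site.

```python
from html.parser import HTMLParser

# Hypothetical page source: the list stays empty until a script fills it in the browser.
STATIC_HTML = """
<html><body>
  <ul id="products"></ul>
  <script>/* items are fetched and inserted here at runtime */</script>
</body></html>
"""


class ListItemCounter(HTMLParser):
    """Count <li> elements present in the raw markup."""

    def __init__(self):
        super().__init__()
        self.items = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.items += 1


counter = ListItemCounter()
counter.feed(STATIC_HTML)
print(counter.items)  # 0: the raw markup contains no list items to extract
```

A browser-based tool runs the page's scripts first, so the same list would contain items by the time fields are selected.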

Where to go next

Understand the core concepts

Learn the main building blocks behind tasks, fields, actions, runs, and exports.

Build tasks without code

Learn how Octoparse lets you create scraping workflows visually.

Compare local and cloud extraction

Understand when to run tasks locally and when to use cloud extraction.

Explore export options

See how extracted data can be exported for downstream use.