Octoparse uses a browser-based environment to open websites, interact with page elements, and build extraction workflows. This makes it possible to handle websites where data appears only after a click, a scroll, a login, or JavaScript loading.

Built-in browser

The built-in browser is where you build and test tasks. It lets you open the target website, select elements, create actions, and preview extracted data. You can use it to:
  • Open target URLs
  • Select text, links, images, and other elements
  • Click buttons or menu items
  • Handle pagination or infinite scroll
  • Log in when a task requires session setup
  • Test how a page loads during extraction

Selection mode vs Browse Mode

Octoparse uses different interaction modes when building a task.
Mode             | Use it for
Selection mode   | Select page elements and create extraction actions
Browse Mode      | Interact with the page like a normal browser
Workflow testing | Watch the task execute actions and collect sample data
Use Browse Mode when you need to manually interact with the page before continuing task setup, such as logging in, closing a popup, opening a menu, or reaching the page state you want to scrape.

Dynamic pages

Many websites load data after the initial page load. The built-in browser helps you configure actions for these cases. Common dynamic behaviors include:
  • Infinite scroll
  • “Load more” buttons
  • Dropdown filters
  • Tabs
  • Popups
  • JavaScript-rendered content
  • Login walls
  • Detail pages opened after clicking a list item
For these pages, the task may need waits, scroll actions, clicks, or pagination steps.
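The idea behind a scroll step for infinite-scroll pages can be sketched in a few lines of Python. This is a conceptual illustration only, not Octoparse's implementation or API: `FakePage`, `scroll_to_bottom`, `item_count`, and `scroll_until_stable` are hypothetical names invented for the example, and the page is simulated so the code runs on its own. The loop keeps scrolling until the number of loaded items stops growing for a few scrolls in a row, or until a scroll limit is reached.

```python
from dataclasses import dataclass


@dataclass
class FakePage:
    """Hypothetical stand-in for a page that reveals more items on each scroll."""
    total_items: int
    batch_size: int
    loaded: int = 0

    def __post_init__(self) -> None:
        # The first batch is visible as soon as the page opens.
        self.loaded = min(self.batch_size, self.total_items)

    def scroll_to_bottom(self) -> None:
        # Each scroll loads one more batch until nothing is left to load.
        self.loaded = min(self.loaded + self.batch_size, self.total_items)

    def item_count(self) -> int:
        return self.loaded


def scroll_until_stable(page, max_scrolls: int = 50, patience: int = 2) -> int:
    """Scroll until the item count is unchanged for `patience` scrolls in a row.

    Stops early at `max_scrolls`. Returns the number of scrolls performed.
    """
    scrolls = 0
    unchanged = 0
    last_count = page.item_count()
    while scrolls < max_scrolls and unchanged < patience:
        page.scroll_to_bottom()
        scrolls += 1
        count = page.item_count()
        if count == last_count:
            unchanged += 1  # nothing new appeared after this scroll
        else:
            unchanged = 0   # new items appeared, so keep going
            last_count = count
    return scrolls


page = FakePage(total_items=45, batch_size=20)
scrolls = scroll_until_stable(page)
print(page.item_count(), scrolls)
```

On a real page, a wait would normally follow each scroll so that slow content has time to appear before the count is checked. The `max_scrolls` limit guards against pages that never stop loading.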

Best practices

  • Wait until the page is fully loaded before selecting fields.
  • Use Browse Mode for manual navigation, then switch back to selection mode.
  • Add wait steps when content loads slowly.
  • Test a small sample before running the full task.
  • Recheck field selection if a website changes its layout.
  • Check the run logs when a task behaves differently during a run than it did during testing.
A browser-based task depends on the structure and behavior of the target website. If the website changes, the task may need to be updated.
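One way to act on the small-sample and layout-change advice above is to check a sample export for fields that come back mostly empty, which often suggests a selector no longer matches the page. The sketch below assumes the sample has been loaded as a list of dictionaries. The field names, the sample rows, and the 0.5 threshold are made up for illustration, and `empty_field_rates` and `suspicious_fields` are hypothetical helpers, not part of Octoparse.

```python
def empty_field_rates(rows, fields):
    """Return, for each field, the fraction of rows where it is missing or blank."""
    if not rows:
        # With no rows at all, treat every field as fully empty.
        return {name: 1.0 for name in fields}
    rates = {}
    for name in fields:
        blank = 0
        for row in rows:
            value = row.get(name)
            if value is None or not str(value).strip():
                blank += 1
        rates[name] = blank / len(rows)
    return rates


def suspicious_fields(rows, fields, threshold=0.5):
    """List the fields whose blank rate is above `threshold`."""
    rates = empty_field_rates(rows, fields)
    return [name for name in fields if rates[name] > threshold]


# Illustrative sample rows (invented data, not real output).
sample = [
    {"title": "Widget A", "price": "9.99", "rating": ""},
    {"title": "Widget B", "price": "", "rating": None},
    {"title": "Widget C", "price": "4.50"},
    {"title": "Widget D", "price": "12.00", "rating": " "},
]
print(suspicious_fields(sample, ["title", "price", "rating"]))
```

A field flagged this way is a candidate for reselecting in the task, not proof of a problem: some fields are legitimately empty on many pages.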