Octoparse uses a browser-based environment to open websites, interact with page elements, and build extraction workflows. This helps handle websites where data appears after clicks, scrolling, login, or JavaScript loading.
Built-in browser
The built-in browser is where you build and test tasks. It lets you open the target website, select elements, create actions, and preview extracted data. You can use it to:

- Open target URLs
- Select text, links, images, and other elements
- Click buttons or menu items
- Handle pagination or infinite scroll
- Log in when a task requires session setup
- Test how a page loads during extraction
Selection mode vs Browse Mode
Octoparse uses different interaction modes when building a task.

| Mode | Use it for |
|---|---|
| Selection mode | Select page elements and create extraction actions |
| Browse Mode | Interact with the page like a normal browser |
| Workflow testing | Watch the task execute actions and collect sample data |
Dynamic pages
Many websites load data after the initial page load. The built-in browser helps you configure actions for these cases. Common dynamic behaviors include:

- Infinite scroll
- “Load more” buttons
- Dropdown filters
- Tabs
- Popups
- JavaScript-rendered content
- Login walls
- Detail pages opened after clicking a list item
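
Infinite scroll and “Load more” pages share one underlying pattern: trigger more content, then stop once a round brings nothing new. The sketch below illustrates that loop in plain Python. It is only an illustration of the idea, not how Octoparse implements it; `collect_until_stable`, `fetch_next_batch`, and `FakeFeed` are hypothetical names standing in for whatever reveals more items on a real page.

```python
from typing import Callable, List


def collect_until_stable(
    fetch_next_batch: Callable[[], List[str]],
    max_rounds: int = 50,
) -> List[str]:
    """Keep requesting more items until a round adds nothing new.

    `fetch_next_batch` is a hypothetical callable that simulates one
    scroll or one "Load more" click and returns the items now visible.
    `max_rounds` caps the loop so an endless feed cannot run forever.
    """
    seen = set()
    ordered: List[str] = []
    for _ in range(max_rounds):
        added = 0
        for item in fetch_next_batch():
            if item not in seen:
                seen.add(item)
                ordered.append(item)
                added += 1
        if added == 0:
            # Nothing new appeared, so assume the end of the list.
            break
    return ordered


class FakeFeed:
    """Hypothetical stand-in for a page that reveals items in chunks."""

    def __init__(self, items: List[str], chunk: int = 3) -> None:
        self._items = items
        self._chunk = chunk
        self._pos = 0

    def scroll(self) -> List[str]:
        # Return the next chunk, or an empty list once exhausted.
        batch = self._items[self._pos:self._pos + self._chunk]
        self._pos += self._chunk
        return batch


if __name__ == "__main__":
    feed = FakeFeed([f"item-{i}" for i in range(7)])
    print(collect_until_stable(feed.scroll))
```

The round cap mirrors a common safeguard for feeds that never end: without a limit, a scroll loop on such a page would not terminate.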
Best practices
- Wait until the page is fully loaded before selecting fields.
- Use Browse Mode for manual navigation, then switch back to selection mode.
- Add wait steps when content loads slowly.
- Test a small sample before running the full task.
- Recheck field selection if a website changes its layout.
- Use logs when a task behaves differently during a run.
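
To make the "add wait steps" advice concrete, here is a minimal sketch of what a wait step does conceptually: poll for a condition until it holds or a timeout expires. The function name `wait_until` and its parameters are assumptions made for this illustration and do not correspond to an Octoparse setting or API.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Returns True if the condition was met, False if the wait timed out.
    The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Simulate content that becomes available on the third check.
    checks = {"count": 0}

    def content_ready() -> bool:
        checks["count"] += 1
        return checks["count"] >= 3

    print(wait_until(content_ready, timeout=2.0, interval=0.1))
```

Waiting on a condition rather than sleeping for a fixed time keeps fast pages fast while still giving slow pages room to finish loading.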
A browser-based task depends on the structure and behavior of the target website. If the website changes, the task may need to be updated.
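
One rough way to notice that a layout change has broken field selection is to check a small sample of extracted rows for fields that are suddenly blank. The sketch below assumes the sample is available as a list of dictionaries, for example after loading an exported file; the function names and the 0.5 threshold are hypothetical choices for illustration.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def empty_field_rates(rows: List[Row], fields: List[str]) -> Dict[str, float]:
    """Return, per field, the share of rows where it is missing or blank."""
    if not rows:
        # No data at all: treat every field as fully empty.
        return {field: 1.0 for field in fields}
    rates: Dict[str, float] = {}
    for field in fields:
        blanks = sum(1 for row in rows if not (row.get(field) or "").strip())
        rates[field] = blanks / len(rows)
    return rates


def suspicious_fields(
    rows: List[Row], fields: List[str], threshold: float = 0.5
) -> List[str]:
    """List fields whose empty rate meets or exceeds `threshold`."""
    rates = empty_field_rates(rows, fields)
    return [field for field in fields if rates[field] >= threshold]


if __name__ == "__main__":
    sample = [
        {"title": "A", "price": ""},
        {"title": "B", "price": None},
        {"title": "C"},
    ]
    print(suspicious_fields(sample, ["title", "price"]))
```

A field that was reliably filled in earlier runs and now shows a high empty rate is a reasonable signal to reopen the task and recheck its selection.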