Auto-detect helps you create a task faster by letting Octoparse scan a web page, identify repeated data patterns, and generate a starting extraction workflow. It is most useful for pages with structured lists, tables, search results, product grids, directories, or similar repeated items.

When to use Auto-detect

Use Auto-detect when:
  • The page has repeated items such as products, listings, reviews, or search results
  • You want a quick starting workflow
  • You are not sure which elements to select manually
  • You want Octoparse to suggest fields and pagination logic
  • You plan to review and adjust the generated workflow afterward

Auto-detect is a starting point, not a guarantee that every field or action will be correct.

How Auto-detect works

1. Open the target page
   Start from the page that contains the data you want to extract.

2. Run Auto-detect
   Let Octoparse scan the page and detect repeated data regions.

3. Review detected fields
   Check whether the suggested fields match the data you need.

4. Confirm page navigation
   Review pagination, scrolling, or next-page actions if they are detected.

5. Test the task
   Run a sample and verify the output before scaling the extraction.
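
To make "repeated data regions" concrete, here is a minimal Python sketch of one way such a region could be found: count how often the same tag and class combination repeats under a single parent element. This is an illustration only, not Octoparse's actual detection logic, which is not described on this page. The names `RepeatFinder`, `most_repeated`, and `SAMPLE` are hypothetical, and the sketch assumes well-formed HTML with explicit closing tags.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class RepeatFinder(HTMLParser):
    """Counts, for each parent element, how often each child signature repeats."""

    def __init__(self):
        super().__init__()
        self.stack = []              # ids of the currently open elements
        self.children = {}           # parent id -> Counter of child signatures
        self.labels = {0: "(root)"}  # element id -> signature, for reporting
        self.next_id = 1

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class") or ""
        signature = f"{tag}.{cls}" if cls else tag
        parent_id = self.stack[-1] if self.stack else 0
        self.children.setdefault(parent_id, Counter())[signature] += 1
        if tag not in VOID_TAGS:
            self.labels[self.next_id] = signature
            self.stack.append(self.next_id)
            self.next_id += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


def most_repeated(html):
    """Return (parent signature, child signature, count) for the strongest repeat."""
    finder = RepeatFinder()
    finder.feed(html)
    best = ("(root)", "", 0)
    for parent_id, counter in finder.children.items():
        signature, count = counter.most_common(1)[0]
        if count > best[2]:
            best = (finder.labels[parent_id], signature, count)
    return best


# Hypothetical page fragment: a product list with three similar items.
SAMPLE = """
<ul class="products">
  <li class="item"><span class="name">A</span><span class="price">1</span></li>
  <li class="item"><span class="name">B</span><span class="price">2</span></li>
  <li class="item"><span class="name">C</span><span class="price">3</span></li>
</ul>
"""

print(most_repeated(SAMPLE))  # ('ul.products', 'li.item', 3)
```

A heuristic like this also suggests why irregular layouts are harder to detect: if each item uses different markup, no single signature repeats often enough to stand out.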

What to review after detection

After Auto-detect generates a workflow, check:
Area | What to verify
Fields | Are the correct values captured?
Field names | Are column names clear and meaningful?
Pagination | Does the task move to the next page correctly?
Detail pages | Does the workflow open item details when needed?
Duplicates | Are repeated or unwanted elements included?
Missing values | Are some rows missing important fields?
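
Some of these checks, such as duplicates and missing values, can be scripted against a sample export. The Python sketch below assumes the sample run was exported as CSV. The column names (`title`, `price`, `url`), the sample rows, and the helper `review_rows` are hypothetical and only illustrate the idea.

```python
import csv
import io
from collections import Counter


def review_rows(rows, required_fields):
    """Summarize duplicate rows and empty required fields in exported data."""
    rows = list(rows)
    # Two rows count as duplicates when every column value is identical.
    seen = Counter(tuple(str(value) for value in row.values()) for row in rows)
    duplicates = sum(count - 1 for count in seen.values())
    # A value counts as missing when it is absent, empty, or whitespace only.
    missing = {
        field: sum(1 for row in rows if not (row.get(field) or "").strip())
        for field in required_fields
    }
    return {"rows": len(rows), "duplicates": duplicates, "missing": missing}


# Hypothetical sample export: one duplicated row and one empty price.
SAMPLE_CSV = """title,price,url
Widget,9.99,https://example.com/widget
Widget,9.99,https://example.com/widget
Gadget,,https://example.com/gadget
"""

report = review_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)), ["title", "price"])
print(report)
# {'rows': 3, 'duplicates': 1, 'missing': {'title': 0, 'price': 1}}
```

To run it against a real export, open the file with `open(path, newline="", encoding="utf-8")` and pass `csv.DictReader(file)` in place of the in-memory sample.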

When manual editing is needed

Manual adjustments may be needed when:
  • The page layout is irregular
  • Important fields are outside the detected list
  • The website loads content dynamically
  • Pagination is not detected correctly
  • The page requires login, filters, popups, or user interaction
  • You need to extract data from detail pages

Use the no-code builder to refine the generated workflow.

Do not assume Auto-detect output is production-ready. Always review fields, run a sample, and check the exported data.

No-code builder

Adjust or build workflows manually after Auto-detect.

Refine data

Clean and reformat extracted fields before export.