CAPTCHA is a verification challenge that websites use to distinguish human users from automated traffic. If a website shows a CAPTCHA during task building or extraction, the task may pause, fail, or return incomplete data. A CAPTCHA usually indicates that the website has detected unusual activity, enforces strict access controls, or protects a sensitive workflow such as login, search, or repeated page loading.

Common CAPTCHA situations

CAPTCHA may appear when:
  • A task sends too many requests too quickly
  • The website detects repeated access from the same IP
  • Login or session behavior looks unusual
  • The task runs from a cloud environment the site does not trust
  • The website has strong anti-bot protection
  • A proxy or IP address has a poor reputation

What to do first

1. Run a local test
   Check whether CAPTCHA appears during task building, local runs, or only cloud runs.

2. Slow down the workflow
   Add waits, reduce frequency, and avoid unnecessary repeated actions.

3. Check login and cookies
   If the website requires login, confirm that the session is valid and stable.

4. Review proxy needs
   If IP reputation or location is the issue, test an appropriate proxy setup.

5. Monitor logs
   Check where CAPTCHA appears and whether the task can continue afterward.
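
The log check in the last step can be sketched in a few lines of Python, assuming the run log has been exported as plain text with one message per line. That format, the sample messages, and the function name `find_first_captcha` are hypothetical and are not part of any Octoparse API. The sketch only reports the first line that mentions a CAPTCHA and whether anything was logged after it.

```python
from typing import Optional


def find_first_captcha(log_lines: list[str]) -> Optional[dict]:
    """Return where 'captcha' first appears in a list of log lines, or None.

    Assumption: the log is plain text, one message per line (hypothetical format).
    """
    for index, line in enumerate(log_lines):
        if "captcha" in line.lower():
            return {
                "line_number": index + 1,  # 1-based position in the log
                "message": line.strip(),
                # True if any line was logged after the CAPTCHA line
                "continued_afterward": index < len(log_lines) - 1,
            }
    return None


# Made-up sample log, for illustration only.
sample_log = [
    "Opened page 1",
    "Extracted 20 rows",
    "Opened page 2",
    "CAPTCHA detected on page 2",
    "Extracted 0 rows",
]
result = find_first_captcha(sample_log)
print(result)
```

A later log line does not prove the task recovered: in the sample, the run continues but extracts no rows, so the lines that follow the CAPTCHA still need a manual read.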

Possible approaches

Approach | When it helps
Add waits | The website reacts to fast or repeated actions
Reduce schedule frequency | CAPTCHA appears after too many runs
Use stable cookies | The site requires session continuity
Configure proxy | IP reputation or region affects access
Run locally | The site blocks cloud environments
Manually verify during setup | CAPTCHA appears before reaching the target page
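
As an illustration of the "Add waits" and "Reduce schedule frequency" approaches, the following Python sketch lengthens the wait each time a CAPTCHA was seen recently. It is a generic back-off calculation, not an Octoparse setting. The function name `wait_seconds`, its parameters, and the default 300-second cap are assumptions chosen for the example.

```python
def wait_seconds(base_seconds: float, recent_captcha_count: int,
                 max_seconds: float = 300.0) -> float:
    """Return how long to wait before the next action or run.

    The wait doubles for every recent CAPTCHA and never exceeds max_seconds.
    All names and defaults here are illustrative assumptions.
    """
    # Clamp the exponent so the power of two stays small and non-negative.
    exponent = min(max(recent_captcha_count, 0), 20)
    return min(max_seconds, base_seconds * (2 ** exponent))


# With a 2-second base wait: no CAPTCHAs, three CAPTCHAs, many CAPTCHAs.
for hits in (0, 3, 50):
    print(hits, wait_seconds(2.0, hits))
```

The same idea applies to whole runs: if CAPTCHA appears after several scheduled runs, widening the interval between runs is the schedule-level equivalent of a longer wait between actions.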

Limits

CAPTCHA is designed to prevent automated access. Some sites may not be suitable for automated extraction if CAPTCHA appears consistently or blocks the required workflow.
Do not attempt to bypass security controls in a way that violates website terms or applicable laws. If CAPTCHA prevents access to data you are not permitted to collect, stop the task.

Proxy

Use proxy settings when IP or location affects task access.

Auto-login & cookies

Improve session stability for login-required websites.