CAPTCHA is a verification challenge that websites use to distinguish human visitors from automated traffic. If a website shows a CAPTCHA during task building or extraction, the task may pause, fail, or return incomplete data. A CAPTCHA usually indicates that the website has detected unusual activity, enforces strict access controls, or protects a sensitive workflow such as login, search, or repeated page loading.
Common CAPTCHA situations
CAPTCHA may appear when:
- A task sends too many requests too quickly
- The website detects repeated access from the same IP
- Login or session behavior looks unusual
- The task runs from a cloud environment the site does not trust
- The website has strong anti-bot protection
- A proxy or IP address has poor reputation
What to do first
Run a local test
Check whether CAPTCHA appears during task building, local runs, or only cloud runs.
Check login and cookies
If the website requires login, confirm that the session is valid and stable.
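
One way to confirm that a saved session is still valid is to check cookie expiry times before a run. The following is a minimal Python sketch, not an Octoparse feature. It assumes the cookies were exported as a list of dictionaries with a `name` and an `expires` field holding a Unix timestamp in seconds, and that a missing or non-positive `expires` marks a session cookie. The function name `expired_cookies` is hypothetical.

```python
import time
from typing import Iterable, List, Mapping, Optional


def expired_cookies(cookies: Iterable[Mapping], now: Optional[float] = None) -> List[str]:
    """Return the names of cookies whose expiry time has already passed."""
    # Use the current time unless the caller supplies one (useful for testing).
    now = time.time() if now is None else now
    expired = []
    for cookie in cookies:
        expires = cookie.get("expires", -1)
        # Assumption: a missing or non-positive value means a session cookie
        # with no fixed expiry, so it cannot be judged from the timestamp.
        if expires is None or expires <= 0:
            continue
        if expires <= now:
            expired.append(cookie["name"])
    return expired
```

If the returned list contains a cookie the site relies on for login, logging in again and saving fresh cookies before the next run is likely to be more effective than retrying with the stale session.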
Possible approaches
| Approach | When it helps |
|---|---|
| Add waits | The website reacts to fast or repeated actions |
| Reduce schedule frequency | CAPTCHA appears after too many runs |
| Use stable cookies | The site requires session continuity |
| Configure proxy | IP reputation or region affects access |
| Run locally | The site blocks cloud environments |
| Manually verify during setup | CAPTCHA appears before reaching the target page |
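
The "Add waits" and "Reduce schedule frequency" rows share one idea: space actions out, and slow down further after a CAPTCHA has appeared. The sketch below illustrates that idea in Python for readers who script their own pacing. It is an assumption-based example, not how Octoparse implements waits; the function name `next_wait_seconds` and its parameters are hypothetical.

```python
import random
from typing import Optional


def next_wait_seconds(
    base: float,
    captcha_hits: int,
    max_wait: float = 300.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a randomized wait that grows with each consecutive CAPTCHA."""
    rng = rng or random.Random()
    # Double the base wait for every consecutive CAPTCHA seen, capped at max_wait.
    ceiling = min(max_wait, base * (2 ** captcha_hits))
    # Add jitter: choose a value between half the ceiling and the ceiling,
    # so actions are not spaced at perfectly regular intervals.
    return rng.uniform(ceiling / 2, ceiling)
```

With `base=2.0`, the wait falls between 1 and 2 seconds when no CAPTCHA has been seen and between 8 and 16 seconds after three consecutive CAPTCHAs. The cap keeps a long run of CAPTCHAs from producing unbounded delays; at that point, one of the other approaches in the table is usually the better choice.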
Limits
CAPTCHA is designed to prevent automated access. Some sites may not be suitable for automated extraction if CAPTCHA appears consistently or blocks the required workflow.
Related pages
Proxy
Use proxy settings when IP or location affects task access.
Auto-login & cookies
Improve session stability for login-required websites.