Websites may detect or limit automated access when traffic looks unusual, too frequent, or different from normal browser behavior. Octoparse includes several settings and workflow practices that can help reduce failures, but no anti-blocking method can guarantee access to every site. Use this page to understand why tasks get blocked and which Octoparse features may help.
Common blocking signals
Websites may block or challenge extraction tasks based on signals such as:
- Too many requests in a short time
- Repeated access from the same IP address
- Missing or unusual browser fingerprints
- Login or cookie inconsistencies
- CAPTCHA challenges
- Region or location mismatch
- Unusual navigation behavior
- Sessions expiring during a run
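
To judge whether a task might trip the "too many requests in a short time" signal, it can help to measure the peak request rate of a run. The following is a minimal sketch that runs outside Octoparse, assuming you have a list of request timestamps in seconds (for example, collected from your own notes or logs). The function name `max_requests_in_window` and the 60-second window are illustrative assumptions, not Octoparse settings.

```python
from collections import deque


def max_requests_in_window(timestamps, window_seconds=60.0):
    """Return the largest number of requests seen inside any sliding window.

    `timestamps` is a hypothetical list of request times in seconds.
    """
    window = deque()
    peak = 0
    for t in sorted(timestamps):
        window.append(t)
        # Drop requests that are a full window (or more) older than the newest one.
        while t - window[0] >= window_seconds:
            window.popleft()
        peak = max(peak, len(window))
    return peak


if __name__ == "__main__":
    sample = [0, 10, 20, 30, 70, 75]
    print(max_requests_in_window(sample, window_seconds=60.0))
```

If the peak is much higher than a person browsing the site would produce, slowing the task down is a reasonable first thing to try.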
Common symptoms
| Symptom | Possible cause |
|---|---|
| Task extracts fewer records than expected | Pagination failed, content did not load, or access was limited |
| Page shows CAPTCHA | Website detected suspicious activity |
| Login page appears during extraction | Session expired or cookies were not preserved |
| Cloud run behaves differently from local run | Website treats the cloud environment's IP address, location, or browser signals differently |
| Fields become empty | Page structure changed or content was blocked |
| Task stops unexpectedly | Network, blocking, selector, or page load issue |
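
Two of the symptoms above, fewer records than expected and empty fields, can be checked on exported data after a run. The sketch below assumes the results have already been loaded as a list of dictionaries, one per record. The function `summarize_run`, the field names, and the expected count are hypothetical and not part of Octoparse.

```python
def summarize_run(rows, expected_count, fields):
    """Summarize record shortfall and per-field empty rates for one run.

    `rows` is a list of dicts (one per extracted record); `fields` lists
    the column names to check. All names here are illustrative.
    """
    total = len(rows)
    empty_rate = {}
    for field in fields:
        empty = 0
        for row in rows:
            value = row.get(field)
            # Treat missing values, None, and whitespace-only strings as empty.
            if value is None or not str(value).strip():
                empty += 1
        empty_rate[field] = empty / total if total else 1.0
    return {
        "record_count": total,
        "shortfall": max(expected_count - total, 0),
        "empty_rate": empty_rate,
    }


if __name__ == "__main__":
    rows = [
        {"title": "A", "price": "10"},
        {"title": "B", "price": ""},
        {"title": "C", "price": None},
        {"title": " ", "price": "5"},
    ]
    print(summarize_run(rows, expected_count=10, fields=["title", "price"]))
```

A large shortfall points toward pagination or access limits, while a high empty rate in a single field more often suggests a selector or page structure change.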
Anti-blocking options
Proxy settings
Use proxies when the website is sensitive to IP address, location, or request frequency.
Browser fingerprinting
Understand how browser signals may affect website detection.
CAPTCHA handling
Learn what to do when a website shows CAPTCHA during task building or execution.
Auto-login & cookies
Keep session-dependent tasks more stable with login and cookie workflows.
Recommended troubleshooting flow
Run a local test
Watch the task in the built-in browser and identify where blocking or failure appears.
Check whether login or cookies are required
If the site requires an account, confirm that the session is valid before running the task.
Review proxy or location needs
If content depends on region or IP reputation, configure proxy settings where appropriate.
Best practices
- Test with a small sample before scaling.
- Avoid running tasks more frequently than necessary.
- Add wait steps for dynamic or slow-loading pages.
- Keep login sessions and cookies up to date.
- Monitor run logs after changing task settings.
- Respect website terms, robots rules, and applicable laws.
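
As one way to act on the robots rules item, the sketch below uses Python's standard `urllib.robotparser` module to check whether a path is allowed and whether a crawl delay is declared. It runs outside Octoparse and parses robots.txt text you supply, so it makes no network request. The helper names `allowed_by_robots` and `declared_crawl_delay` and the sample rules are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def _parser_for(robots_txt):
    """Build a parser from robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the rules allow `user_agent` to fetch `url`."""
    return _parser_for(robots_txt).can_fetch(user_agent, url)


def declared_crawl_delay(robots_txt, user_agent="*"):
    """Return the Crawl-delay value for `user_agent`, or None if absent."""
    return _parser_for(robots_txt).crawl_delay(user_agent)


if __name__ == "__main__":
    # Sample rules for illustration only.
    sample = "User-agent: *\nDisallow: /private/\nCrawl-delay: 5\n"
    print(allowed_by_robots(sample, "https://example.com/products"))
    print(allowed_by_robots(sample, "https://example.com/private/page"))
    print(declared_crawl_delay(sample))
```

A check like this covers only robots.txt. Website terms and applicable laws still need to be reviewed separately.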