Schedules let Octoparse run tasks automatically at defined times or intervals. They are useful for recurring data collection workflows such as price monitoring, lead list updates, inventory checks, or routine reporting. Scheduled runs are typically used with cloud extraction so tasks can run without depending on your local computer.

When to use schedules

Use schedules when:
  • The same data needs to be collected repeatedly
  • You need daily, weekly, or hourly updates
  • The task should run outside working hours
  • Data needs to be ready before a report or downstream workflow
  • You want to reduce manual task starts

Scheduling workflow

1. Build and test the task
   Confirm the task extracts the correct fields and handles pagination or detail pages correctly.

2. Choose the run environment
   Use cloud extraction for unattended scheduled runs.

3. Set the schedule
   Choose when and how often the task should run.

4. Configure export
   Set up manual or automatic export depending on where the data should go.

5. Monitor results
   Check run history, logs, and exported records after scheduled runs.

What to check before scheduling

Before enabling a schedule, check:
| Check | Why it matters |
| --- | --- |
| Task stability | Scheduled runs repeat the same workflow automatically |
| Login/session setup | Expired sessions can cause failed runs |
| Cloud compatibility | Scheduled runs usually depend on cloud execution |
| Export settings | Data may need to be delivered automatically |
| Website rate limits | Frequent runs can increase blocking risk |
| Run time | Long tasks may overlap if scheduled too often |

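The run-time check can be made concrete with a little arithmetic. The following is a minimal Python sketch, not an Octoparse feature: it assumes you have measured how long a manual run takes and compares that against the interval you plan to schedule. The function names and the `safety_factor` parameter are hypothetical and exist only for illustration.

```python
from datetime import timedelta


def overlaps(run_time: timedelta, interval: timedelta) -> bool:
    """Return True if a run would still be going when the next one is due."""
    return run_time >= interval


def suggested_interval(run_time: timedelta, safety_factor: float = 2.0) -> timedelta:
    """Suggest an interval that leaves headroom above the observed run time.

    safety_factor is an illustrative assumption, not an Octoparse setting.
    """
    if safety_factor < 1.0:
        raise ValueError("safety_factor must be at least 1.0")
    return run_time * safety_factor


if __name__ == "__main__":
    observed = timedelta(minutes=45)   # measured from a manual test run
    planned = timedelta(minutes=30)    # interval you intend to schedule
    print("Overlap risk:", overlaps(observed, planned))
    print("Suggested interval:", suggested_interval(observed))
```

If the observed run time is close to or longer than the planned interval, a longer interval or a smaller task is likely the safer choice.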
  • Test the task manually before scheduling it.
  • Start with a conservative frequency.
  • Monitor the first few scheduled runs.
  • Use logs to diagnose failures.
  • Review exported records for missing fields or duplicates.
  • Adjust frequency if the website updates less often than expected.

Avoid scheduling tasks more frequently than needed. Excessive runs can create duplicate data, increase usage, and raise the chance of website blocking.
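Reviewing exported records for missing fields or duplicates can be partly automated once the data has been exported. The sketch below is one possible approach in plain Python, assuming the export has already been loaded as a list of dictionaries (for example from a CSV or JSON file). The function name `review_records` and the sample field names `url` and `price` are hypothetical and do not correspond to any Octoparse API.

```python
from collections import Counter


def review_records(records, required_fields, key_fields):
    """Summarize missing fields and duplicate rows in exported records.

    records: list of dicts, one per exported row
    required_fields: field names that must be present and non-empty
    key_fields: field names that together identify a unique row
    """
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in records:
        # Count rows where a required field is absent or blank.
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
        # Treat rows with an already-seen key as duplicates.
        key = tuple(row.get(f) for f in key_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"total": len(records), "missing": dict(missing), "duplicates": duplicates}


if __name__ == "__main__":
    # Hypothetical sample rows standing in for one scheduled run's export.
    sample = [
        {"url": "https://example.com/a", "price": "10"},
        {"url": "https://example.com/a", "price": "10"},
        {"url": "https://example.com/b", "price": ""},
    ]
    print(review_records(sample, ["url", "price"], ["url"]))
```

A duplicate count that keeps rising across runs may suggest the schedule is more frequent than the website's update rate warrants.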