Upload scraped data directly to an Amazon S3 bucket. Supported file formats are Excel, CSV, HTML, JSON, and XML.

Prerequisites

  • Octoparse Standard plan or above
  • An AWS account with the target S3 bucket already created
  • An IAM user with s3:PutObject permission on the target bucket

Configure IAM permissions

Octoparse uses an Access Key to write to S3. Create a dedicated IAM user for this purpose and grant it only the minimum required permissions.

Minimum-permission policy example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}
After creating the IAM user, generate an Access Key in Security credentials and save the Access Key ID and Secret Access Key.
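
If several buckets need the same policy, the JSON above can be generated instead of edited by hand. The following is a minimal sketch using only the Python standard library; the function name `build_put_only_policy` is hypothetical and `my-scrape-exports` is a placeholder bucket name.

```python
import json


def build_put_only_policy(bucket_name: str) -> dict:
    """Return an IAM policy allowing only s3:PutObject on one bucket's objects."""
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # The trailing /* scopes the grant to objects inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "my-scrape-exports" is a placeholder bucket name.
    print(json.dumps(build_put_only_policy("my-scrape-exports"), indent=2))
```

The printed JSON can be pasted into the IAM policy editor for the dedicated user.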

Configure the export

Fill in the export settings in the following order:

Configuration name

Give the config a name so you can reuse it later. The dropdown lets you pick an existing config or type a new name to create one.

Connection details

| Field | Description |
| --- | --- |
| Authentication | Authentication method (currently only Access key) |
| Access key ID | The IAM user’s Access Key ID |
| Secret access key | The IAM user’s Secret Access Key |
| Service area | AWS region (pick from the dropdown) |
| Bucket | Target bucket name |

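
When these values are prepared ahead of time, for example in a notes file or a script, it can help to confirm that none is blank before entering them. The sketch below is illustrative only: the dictionary keys are hypothetical and simply mirror the field labels; they are not an Octoparse API.

```python
# Hypothetical keys mirroring the five connection fields.
REQUIRED_FIELDS = (
    "authentication",
    "access_key_id",
    "secret_access_key",
    "service_area",
    "bucket",
)


def missing_fields(config: dict) -> list:
    """Return the required connection fields that are absent or blank."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(config.get(name) or "").strip()
    ]


if __name__ == "__main__":
    example = {
        "authentication": "Access key",
        "access_key_id": "EXAMPLE_KEY_ID",  # placeholder, not a real key
        "secret_access_key": "",
        "service_area": "us-east-1",
        "bucket": "my-scrape-exports",
    }
    print(missing_fields(example))  # ['secret_access_key']
```
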
Security: the Access Key has write permission on the target bucket. Don’t share it or commit it to a code repository, and rotate it periodically.

Export data as

Pick the upload file format: Excel, CSV, HTML, JSON, or XML.

File naming settings

| Option | Description |
| --- | --- |
| Same as task name | Filename matches the task name; repeated exports overwrite |
| Append current timestamp to task name | Each export creates a new file, keeping history |
| Use a custom name | Specify the filename manually |

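
The three naming options can be pictured as a small function. This is a sketch under stated assumptions: the option keys, the underscore separator, and the `%Y%m%d_%H%M%S` timestamp format are hypothetical, since the exact format Octoparse uses is not stated here.

```python
from datetime import datetime
from typing import Optional


def export_filename(task_name: str, option: str, extension: str,
                    custom_name: Optional[str] = None,
                    now: Optional[datetime] = None) -> str:
    """Build an export filename for one of the three naming options."""
    now = now or datetime.now()
    if option == "same_as_task":
        base = task_name
    elif option == "append_timestamp":
        # Assumed timestamp format; the real suffix may differ.
        base = f"{task_name}_{now.strftime('%Y%m%d_%H%M%S')}"
    elif option == "custom":
        if not custom_name:
            raise ValueError("custom_name is required for the custom option")
        base = custom_name
    else:
        raise ValueError(f"unknown naming option: {option}")
    return f"{base}.{extension}"


if __name__ == "__main__":
    fixed = datetime(2024, 1, 2, 3, 4, 5)
    print(export_filename("products", "append_timestamp", "csv", now=fixed))
    # products_20240102_030405.csv
```
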
When a file with the same name already exists

| Option | Description |
| --- | --- |
| Create a new file with current timestamp | Keeps the original; new file gets a timestamp suffix |
| Replace the existing file | Overwrites the original |
| Append new data to the existing file | Appends to the existing file (CSV / Excel only) |

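
The collision options can be summarised as a decision function. This is a hedged sketch: the option keys and returned action names are hypothetical, and raising an error for unsupported formats is an assumption. The only rule taken from the table is that appending applies to CSV and Excel.

```python
# Formats for which the append option is documented as available.
APPENDABLE_FORMATS = {"csv", "excel"}


def resolve_collision(option: str, file_format: str, file_exists: bool) -> str:
    """Return the action for an export: write, write_with_timestamp, overwrite, or append."""
    if not file_exists:
        # No name clash, so the collision option is irrelevant.
        return "write"
    if option == "new_with_timestamp":
        return "write_with_timestamp"
    if option == "replace":
        return "overwrite"
    if option == "append":
        if file_format.lower() not in APPENDABLE_FORMATS:
            raise ValueError(f"append is not supported for {file_format}")
        return "append"
    raise ValueError(f"unknown collision option: {option}")


if __name__ == "__main__":
    print(resolve_collision("append", "CSV", file_exists=True))    # append
    print(resolve_collision("replace", "JSON", file_exists=True))  # overwrite
```
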
Click Confirm to save the config and start the export.

Common errors

| Error | Cause | Fix |
| --- | --- | --- |
| InvalidKeySecret | Wrong Access Key ID or Secret Access Key | Re-verify credentials |
| NoSuchBucket | Bucket name wrong or doesn’t exist | Confirm the bucket name |
| PermissionDenied | Insufficient IAM permissions | Check the IAM policy includes s3:PutObject |
| IllegalLocation | Bucket region doesn’t match Service area | Select the bucket’s actual region |
| FormatError | Configuration malformed | Verify every field is filled in |
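
For scripted troubleshooting or log triage, the error list can be kept as a lookup. A minimal sketch: the function name `suggest_fix` and the fallback message for unlisted codes are hypothetical; the five error codes, causes, and fixes are the ones documented in this section.

```python
# error code -> (cause, fix), copied from the common errors list.
COMMON_ERRORS = {
    "InvalidKeySecret": ("Wrong Access Key ID or Secret Access Key",
                         "Re-verify credentials"),
    "NoSuchBucket": ("Bucket name wrong or doesn't exist",
                     "Confirm the bucket name"),
    "PermissionDenied": ("Insufficient IAM permissions",
                         "Check the IAM policy includes s3:PutObject"),
    "IllegalLocation": ("Bucket region doesn't match Service area",
                        "Select the bucket's actual region"),
    "FormatError": ("Configuration malformed",
                    "Verify every field is filled in"),
}


def suggest_fix(error_code: str) -> str:
    """Return a one-line cause and fix for a documented error code."""
    # The fallback text for unknown codes is an assumption of this sketch.
    cause, fix = COMMON_ERRORS.get(
        error_code, ("Unknown error", "Review the export configuration")
    )
    return f"{error_code}: {cause}. Fix: {fix}."


if __name__ == "__main__":
    print(suggest_fix("NoSuchBucket"))
    # NoSuchBucket: Bucket name wrong or doesn't exist. Fix: Confirm the bucket name.
```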