Get batch historical Tweets

Enterprise

Historical PowerTrack overview

This is an enterprise API available within our managed access levels only. To use this API, you must first set up an account with our enterprise sales team.

Both Historical PowerTrack and Full-Archive Search provide access to any publicly available Tweet. Read our tutorial on ‘Choosing a historical API’ to decide which one best fits your needs.

Historical PowerTrack provides access to the entire historical archive of public Twitter data – back to the first Tweet – using the same rule-based-filtering system as the realtime PowerTrack stream to deliver complete coverage of historical Twitter data.

Accessing data through Historical PowerTrack is accomplished through creating historical ‘jobs’ – a set of PowerTrack filtering rules and a historical time frame for which you would like to retrieve matching data from the Twitter archive. These jobs can be created and managed through the Historical PowerTrack API.

Each Historical PowerTrack job consists of the following steps.

  1. Create a new job for a time frame and set of PowerTrack rules using a request to the Historical PowerTrack API. The API then samples the time period and generates a rough estimate of the expected data volume and the time required to complete the job.
  2. Accept or reject the job based on the estimate generated in Step 1, using another request to the API. If the estimated volume or time required falls outside the range you expected, you may want to reject the job. If the estimate is acceptable, accept the job and it will run to generate the data files.
  3. When the job is complete, send a request to the API to retrieve the list of URLs used to download the data files. To learn more about the format of the delivered data, please read our Historical PowerTrack Technical Details page.
  4. Using the list of URLs obtained in Step 3, download the array of Twitter data files. These files each represent a 10-minute segment of the overall job and are available for download for 15 days after the job has been accepted.
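As a rough planning aid for Step 4, the sketch below enumerates the 10-minute segments a job's time frame covers and computes the last moment the files remain downloadable. The function names are hypothetical, and the sketch assumes the time frame is aligned to 10-minute boundaries and that the 15-day window is measured exactly from the acceptance timestamp. The segment count is an estimate of how many files to expect, not a guarantee.

```python
from datetime import datetime, timedelta, timezone

SEGMENT = timedelta(minutes=10)        # each data file covers a 10-minute segment
DOWNLOAD_WINDOW = timedelta(days=15)   # files stay available 15 days after acceptance


def segment_starts(from_date, to_date):
    """Yield the start time of each 10-minute segment in [from_date, to_date)."""
    current = from_date
    while current < to_date:
        yield current
        current += SEGMENT


def download_deadline(accepted_at):
    """Return the assumed end of the download window for an accepted job."""
    return accepted_at + DOWNLOAD_WINDOW


if __name__ == "__main__":
    start = datetime(2016, 1, 1, 0, 0, tzinfo=timezone.utc)
    end = datetime(2016, 1, 1, 1, 0, tzinfo=timezone.utc)
    print(len(list(segment_starts(start, end))), "segments")
    print("download by", download_deadline(start).isoformat())
```

Because the download window starts when the job is accepted rather than when it completes, a long-running job leaves less time for downloading than the 15 days might suggest.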

Billing Note: Historical PowerTrack jobs are counted for billing based on the day that they are "completed". For example, if a job is accepted and starts running on August 31 and completes on September 1, that job counts towards the September billing quota.
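The billing rule can be expressed as a one-line lookup on the completion time. This is only an illustrative sketch: `billing_month` is a hypothetical helper, and the year in the example is arbitrary. The time zone used to decide the completion day is not stated here, so the function simply uses the timestamp it is given.

```python
from datetime import datetime


def billing_month(completed_at):
    """Return the (year, month) a job counts toward: the month it completed in."""
    return (completed_at.year, completed_at.month)


if __name__ == "__main__":
    # Accepted on August 31, completed on September 1: billed to September.
    print(billing_month(datetime(2016, 9, 1, 2, 30)))
```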

Rules and filtering

Each Historical PowerTrack job supports up to 1,000 PowerTrack rules, each containing 2,048 characters or fewer.
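Before creating a job, it can help to check a ruleset against these limits locally. The following is a minimal sketch with a hypothetical `validate_rules` helper. It assumes that rule length is measured the way Python's `len` counts characters, which may differ from how the API counts them.

```python
MAX_RULES = 1000        # rules per job
MAX_RULE_CHARS = 2048   # characters per rule


def validate_rules(rules):
    """Return a list of problems; an empty list means the ruleset is within limits."""
    problems = []
    if len(rules) > MAX_RULES:
        problems.append(f"{len(rules)} rules exceeds the limit of {MAX_RULES}")
    for index, rule in enumerate(rules):
        if len(rule) > MAX_RULE_CHARS:
            problems.append(
                f"rule {index} has {len(rule)} characters (limit {MAX_RULE_CHARS})"
            )
    return problems


if __name__ == "__main__":
    for problem in validate_rules(["cat has:images", "x" * 3000]):
        print(problem)
```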

Historical PowerTrack supports the same rules and operators as realtime PowerTrack. If you would like to learn more about building a PowerTrack ruleset, please visit the Getting started with Premium Operators guide and the List of Premium Operators page.

Note: Because the Twitter platform has evolved since 2006, some filtering operators depend on Tweet metadata that was introduced over time. To learn more about these details, please visit our Historical PowerTrack metadata timeline documentation.

Data format

Data for completed Historical PowerTrack jobs is delivered as a set of flat, gzip-compressed JSON files. Each file represents a 10-minute segment of the overall time period requested for the job. Learn more about the technical details of the Historical PowerTrack product.
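Once a file has been downloaded, it can be read with the Python standard library. The sketch below assumes each file holds one JSON object per line. That layout is an assumption, as is the `read_activities` name, so check the technical details page before relying on it.

```python
import gzip
import json


def read_activities(path):
    """Yield one parsed JSON object per non-empty line of a gzip-compressed file.

    Assumes newline-delimited JSON; adjust if the files use a different layout.
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Reading line by line keeps memory use flat, which matters when a job spans many 10-minute segments.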

If you would like to learn more about what is included in our Tweet payloads, you can visit our Data Dictionary page. Also, since Historical PowerTrack allows you to use up to 1,000 rules, each Tweet payload comes with a Matching Rules enrichment that allows you to identify which rule each Tweet matched.
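With many rules in a single job, the Matching Rules enrichment lets you break results down per rule. The sketch below tallies Tweets by rule. It assumes the enrichment appears as a `matching_rules` list of objects carrying a `tag` key. Both names are assumptions here, so they are exposed as parameters, and `count_by_rule` is a hypothetical helper.

```python
from collections import Counter


def count_by_rule(tweets, field="matching_rules", key="tag"):
    """Count how many Tweets matched each rule.

    `field` and `key` are assumed payload names; confirm them in the Data Dictionary.
    """
    counts = Counter()
    for tweet in tweets:
        for rule in tweet.get(field, []):
            counts[rule.get(key)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"matching_rules": [{"tag": "cats"}, {"tag": "pets"}]},
        {"matching_rules": [{"tag": "cats"}]},
    ]
    print(count_by_rule(sample))
```

A Tweet that matches several rules is counted once under each of them, so the totals can exceed the number of Tweets.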

Subscription details

If you have a Historical PowerTrack monthly subscription, your subscription specifies a number of Historical Days that you can use every calendar month. This is the number of days you can request data from as part of your subscription. These Days run from midnight to midnight UTC. A Historical Day is counted if any minute of a UTC Day is included in the search period.

For example, if your search period begins at 2016-01-01 23:50 UTC and ends at 2016-01-02 00:10 UTC, the request uses up 2 days even though the search period is only 20 minutes. If your subscription includes 300 Historical Days, you could make a single 300-day request, ten 30-day requests, 300 single-day requests, or any other set of Jobs that add up to 300 Historical Days.
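The counting rule above can be checked with a short calculation. The sketch below is one possible reading of it: `historical_days` is a hypothetical helper, it requires timezone-aware datetimes, and it treats the end of the search period as exclusive, so a period ending exactly at midnight does not touch the following day. That last point is an assumption.

```python
from datetime import datetime, timedelta, timezone


def historical_days(start, end):
    """Count the UTC calendar days touched by the period [start, end).

    Both arguments must be timezone-aware datetimes.
    """
    if end <= start:
        raise ValueError("end must be after start")
    start_utc = start.astimezone(timezone.utc)
    # Step back one microsecond so an end time of exactly midnight
    # does not count the day that begins at that instant.
    last_instant = end.astimezone(timezone.utc) - timedelta(microseconds=1)
    return (last_instant.date() - start_utc.date()).days + 1


if __name__ == "__main__":
    begin = datetime(2016, 1, 1, 23, 50, tzinfo=timezone.utc)
    finish = datetime(2016, 1, 2, 0, 10, tzinfo=timezone.utc)
    print(historical_days(begin, finish))  # the 20-minute example: 2 days
```

Summing this value over the jobs planned for a calendar month and comparing the total with the subscription's allowance shows how many Historical Days remain.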

Next steps

Are you looking for a one-time Historical PowerTrack job?

Request a job by filling out this contact form and selecting ‘I only need to do a one-time historical data pull’
Read through our tutorial on one-off Historical PowerTrack jobs

Are you interested in a subscription to the Historical PowerTrack product?

Apply for Enterprise access here
Read our Historical PowerTrack API Reference