Some data fields and filters were added over time and are not available in archive data before certain dates:
- Entities didn't exist before March 2015.
- performance_score didn't exist before May 2015.
- domain_rank didn't exist before April 2016.
- Sentiment didn't exist before June 2016.
- has_video didn't exist before July 2016.
- Social Signals & Site Categories didn't exist before August 2016.
- rating didn't exist before February 2017.
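As a rough illustration, the dates above can be kept in a small lookup table to check whether a query relies on a field that is missing for part of the requested period. This is only a sketch: the dictionary keys are descriptive labels taken from the list, not confirmed API parameter names, `unavailable_fields` is a hypothetical helper, and each field is assumed to be available from the first day of the month listed.

```python
from datetime import date

# First month each field or filter is available, taken from the list above.
# Keys are descriptive labels, not confirmed API parameter names.
# Assumption: a field is available from the first day of the listed month.
FIELD_AVAILABLE_FROM = {
    "entities": date(2015, 3, 1),
    "performance_score": date(2015, 5, 1),
    "domain_rank": date(2016, 4, 1),
    "sentiment": date(2016, 6, 1),
    "has_video": date(2016, 7, 1),
    "social signals": date(2016, 8, 1),
    "site categories": date(2016, 8, 1),
    "rating": date(2017, 2, 1),
}


def unavailable_fields(fields, period_start):
    """Return the requested fields that did not yet exist at period_start."""
    return [
        name
        for name in fields
        if name in FIELD_AVAILABLE_FROM and period_start < FIELD_AVAILABLE_FROM[name]
    ]


# Example: a request starting in January 2016 that filters on two fields.
missing = unavailable_fields(["sentiment", "entities"], date(2016, 1, 1))
print(missing)
```

If the returned list is not empty, consider moving the start date forward or dropping those filters from the query.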
Having access to live web data is important, but sometimes you may also need historical data, either to fill in gaps or to create a benchmark for your data. Webhose.io makes it easy to access historical data going back to December 2014 via our user-friendly online archive.
In this step, you define the filters that will bring in the relevant posts. You do that by writing a Boolean query, just as you would for live data.
Once your query is ready, click on the "Run" button to preview the results. Go over them and make sure they are relevant, since the same query will be used to filter data from the archive.
Now that you have your query ready, you can define the start month/year and the end month/year you want the data from. Note that as you change the timeframe, the estimated number of posts changes and so does the price.
The estimated total number of posts is calculated by querying the last 30 days of live data and multiplying the result count by the number of months the request period spans.
The cost is the greater of two values: the $10 minimum cost per month, or the estimated total number of posts multiplied by $0.0002 (USD). The cost will be charged to your account regardless of the actual result count, even if the dataset comes back empty.
Once you approve the payment details, our system will start retrieving the data. This may take anywhere from a few minutes to a few hours. Once the job is complete, we will send you an email with a link to the data.
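The estimate and pricing rules described above can be sketched in a few lines. This is an illustration only, not the actual billing logic: the function names are hypothetical, and the sketch assumes the $10 minimum applies to each month in the requested period (one possible reading of "minimum cost per month").

```python
PRICE_PER_POST_USD = 0.0002    # price per estimated post, from the text above
MIN_COST_PER_MONTH_USD = 10.0  # minimum cost per month, from the text above


def estimate_posts(live_30_day_count, months):
    """Estimated total posts: 30-day live result count times months spanned."""
    return live_30_day_count * months


def estimate_cost(live_30_day_count, months):
    """Estimated cost in USD: the greater of the minimum and the per-post price."""
    posts = estimate_posts(live_30_day_count, months)
    # Assumption: the $10 minimum is charged for every month in the period.
    minimum = MIN_COST_PER_MONTH_USD * months
    return max(minimum, posts * PRICE_PER_POST_USD)


# Example: the query returns 100,000 posts on 30-day live data,
# and the requested period spans 6 months.
print(estimate_posts(100_000, 6))
print(round(estimate_cost(100_000, 6), 2))
```

For a narrow query, the minimum dominates: with 1,000 live results over 3 months, the per-post price comes to well under a dollar, so under the assumption above the minimum is what gets charged.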
You can also check the dashboard for your order status.
|Accessing the Archive via an API|