Use the following filters to focus only on the data you need.
Escaping reserved characters
To use a reserved character literally in a query (rather than as an operator), escape it with a leading backslash. For instance, to search for external_links:https://www.linkedin.com, write the query as external_links:https\:\/\/www.linkedin.com
The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
Failing to escape these special characters correctly could lead to a syntax error which prevents your query from running.
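The escaping rule above can be sketched as a small helper. This is an illustrative sketch rather than an official client: the function name escape_value is hypothetical, and giving the two-character operators && and || a single leading backslash is an assumption.

```python
import re

# Reserved characters as listed above. The two-character operators && and ||
# are matched first so that each receives one leading backslash (an assumption).
_RESERVED = re.compile(r'(&&|\|\||[+\-=><!(){}\[\]^"~*?:\\/])')


def escape_value(value: str) -> str:
    """Prefix every reserved character in `value` with a backslash."""
    return _RESERVED.sub(r"\\\1", value)


# Reproduces the LinkedIn example from the text above.
print("external_links:" + escape_value("https://www.linkedin.com"))
# external_links:https\:\/\/www.linkedin.com
```

Escape only the value, not the whole query, so that the colon separating the filter name from its value keeps acting as an operator.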
language
The language of the post. The default is any.
Find posts in French or Italian:
(language:french OR language:italian)
author
Return posts written by a specific author.
Find posts written by Admin:
author:Admin
text
A textual Boolean query describing the keywords that should (or shouldn’t) appear in the post text.
text:(apple OR android)
has_video
A Boolean parameter that restricts the search to posts that contain a video.
has_video:true
external_links
Search for posts that included links to another site.
Search for posts that linked to LinkedIn (note that both the slashes and colons are prefixed by a backslash):
external_links:https\:\/\/www.linkedin.com*
is_first
A Boolean parameter that restricts the search to the first post in a thread (excluding comments).
is_first:true
rating
For review posts, the rating parameter is the star rating of the review, a floating-point number between 0.0 and 5.0.
Return all the posts with rating greater than 0:
rating:>0
published
A timestamp (in milliseconds) that lets you filter posts published before or after a certain date/time.
Return posts published after Thu, 30 Mar 2017 09:16:28 GMT:
published:>1490865388000
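The example value above corresponds to milliseconds since the Unix epoch (1490865388000 is Thu, 30 Mar 2017 09:16:28 GMT). A minimal sketch for producing such a value from a date follows; the helper name to_millis is hypothetical, and the same approach would apply to the other timestamp filters.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        # Naive datetimes would be interpreted in local time, which is ambiguous.
        raise ValueError("use a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


cutoff = datetime(2017, 3, 30, 9, 16, 28, tzinfo=timezone.utc)
print(f"published:>{to_millis(cutoff)}")
# published:>1490865388000
```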
site_type
The type of sites to search (the default is any). Available types:
- news
- blogs
- discussions
Only news: site_type:news
News & Blogs: (site_type:news OR site_type:blogs)
site
Limit the results to a specific site or sites.
Limit the results to posts from Yahoo or CNN:
(site:yahoo.com OR site:cnn.com)
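Several filters above (language, site_type, site) use the same pattern: a parenthesized group of field:value terms joined with OR. A minimal sketch of building such a group is shown below; the helper name any_of is hypothetical, and values are assumed to be already escaped where they contain reserved characters.

```python
from typing import Iterable


def any_of(field: str, values: Iterable[str]) -> str:
    """Build a filter that matches any of `values` for a single field."""
    terms = [f"{field}:{value}" for value in values]
    if not terms:
        raise ValueError("at least one value is required")
    if len(terms) == 1:
        # A single term needs no grouping parentheses.
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


print(any_of("site", ["yahoo.com", "cnn.com"]))
# (site:yahoo.com OR site:cnn.com)
```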
thread.country
Webhose.io uses heuristics to determine a site's country of origin, taking into account the site's IP, TLD and language. The country of origin is often inconclusive, in which case it is not set; filtering by country may therefore return much less data than filtering by language.
Return posts from sites from Hong Kong:
thread.country:HK
To get the full country code list, visit countrycode.org.
site_suffix
Limit the results to a specific site suffix.
Return posts from sites whose top-level domain (TLD) is .fr:
site_suffix:fr
site_full
Filter sites by domain and, optionally, by sub-domain.
Return posts from Yahoo Answers:
site_full:answers.yahoo.com
site_category
Limit the results to posts originating from sites categorized as one (or more) of the given categories.
Return posts from sites categorized as sports or games related:
(site_category:sports OR site_category:games)
domain_rank
A rank that specifies how popular a domain is (by monthly traffic). Lower values mean more popular sites.
Search for posts from the top 1,000 sites:
domain_rank:<1000
A thread contains global information about the whole page and its content. A thread can contain multiple posts grouped together.
thread.title
A textual Boolean query describing the keywords that should (or shouldn’t) appear in the thread title.
Search for posts containing the word "glass" and not "metal" in their title:
thread.title:glass -thread.title:metal
thread.section_title
A textual Boolean query describing the keywords that should (or shouldn't) appear in the title of the site section where the post was published.
Search for posts containing the word "food" only in sections whose title contains the word "restaurants":
food AND thread.section_title:restaurants
thread.url
Get all the posts of a specific thread (note that you must escape the http:// part of the URL like so: http\:\/\/).
spam_score
A score between 0 and 1 indicating how spammy the thread text is.
Return threads with a spam score lower than or equal to 0.8:
spam_score:<=0.8
thread.published
A timestamp (in milliseconds) that lets you filter threads published before or after a certain date/time.
Return threads published after Thu, 30 Mar 2017 09:16:28 GMT:
thread.published:>1490865388000
crawled
A timestamp (in milliseconds) that lets you filter posts crawled before or after a certain date/time.
Return posts crawled after Thu, 30 Mar 2017 09:16:28 GMT:
crawled:>1490865388000
performance_score
A virality score for news and blog posts only. The score ranges from 0 to 10, where 0 means the post was rarely or never shared and 10 means the post was shared thousands of times on Facebook.
Search for news or blog posts with performance score higher than 8 (highly viral):
apple performance_score:>8
social.facebook.likes
Return posts filtered by the number of Facebook likes.
Return posts with more than 10 Facebook likes:
social.facebook.likes:>10
social.facebook.shares
Return posts filtered by the number of Facebook shares.
Return posts with more than 10 Facebook shares:
social.facebook.shares:>10
social.facebook.comments
Return posts filtered by the number of Facebook comments.
Return posts with more than 10 Facebook comments:
social.facebook.comments:>10
social.pinterest.shares
Return posts filtered by the number of Pinterest shares.
Return posts with more than 10 Pinterest shares:
social.pinterest.shares:>10
social.stumbledupon.shares
Return posts filtered by the number of StumbleUpon shares.
Return posts with more than 10 StumbleUpon shares:
social.stumbledupon.shares:>10
social.vk.shares
Return posts filtered by the number of VK shares.
Return posts with more than 10 VK shares:
social.vk.shares:>10
We extract entities such as Persons, Organizations and Locations from all the English news and blog posts we crawl. We detect the sentiment attached to Persons and Organizations (not Locations) from the top news outlets.
person
Filter by person name. Use this filter only for disambiguation; otherwise, use a simple keyword search.
person:"barack obama"
organization
Filter by organization/company name. Use this filter only for disambiguation; otherwise, use a simple keyword search.
organization:"apple"
location
Filter by location name.
Important: Don't confuse this with the country filter. If you want to search for sites from a specific country, use the thread.country parameter (explained above).
location:"germany"
[entity].[sentiment]
Find an entity with a sentiment context attached to it.
person.positive:"obama"
organization.negative:"apple"
organization.neutral:"google"
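The [entity].[sentiment] pattern can be sketched as a helper that only accepts the combinations described above: sentiment is attached to persons and organizations, not locations, and the examples use positive, negative and neutral. The function name sentiment_filter is hypothetical, and the name is assumed to contain no double quotes.

```python
# Entities that carry sentiment, per the description above (locations do not).
SENTIMENT_ENTITIES = {"person", "organization"}
# Sentiment values taken from the examples above.
SENTIMENTS = {"positive", "negative", "neutral"}


def sentiment_filter(entity: str, sentiment: str, name: str) -> str:
    """Build an [entity].[sentiment] filter for a quoted entity name."""
    if entity not in SENTIMENT_ENTITIES:
        raise ValueError(f"no sentiment is available for entity type {entity!r}")
    if sentiment not in SENTIMENTS:
        raise ValueError(f"unknown sentiment {sentiment!r}")
    return f'{entity}.{sentiment}:"{name}"'


print(sentiment_filter("organization", "negative", "apple"))
# organization.negative:"apple"
```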