Streaming API request parameters

Updated on Thu, 2013-10-17 16:26

Use the following request parameters to define what data is returned by the Streaming API endpoints:

delimited

This parameter may be used on all streaming endpoints, unless explicitly noted.

Setting this to the string length indicates that statuses should be delimited in the stream, so that clients know how many bytes to read before the end of the status message. Each status is represented by its length in bytes, a newline, and the status text, which is exactly length bytes long. Note that “keep-alive” newlines may be inserted before each length.

As an example, consider this response to a request to https://stream.twitter.com/1.1/statuses/filter.json?delimited=length&track=twitterapi:

HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

1953
{"retweet_count":0,"text":"Man I like me some @twitterapi","entities":{"urls":[],"hashtags":[],"user_mentions":[{"indices":[19,30],"name":"Twitter API","id":6253282,"screen_name":"twitterapi","id_str":"6253282"}]},"retweeted":false,"in_reply_to_status_id_str":null,"place":null,"in_reply_to_user_id_str":null,"coordinates":null,"source":"web","in_reply_to_screen_name":null,"in_reply_to_user_id":null,"in_reply_to_status_id":null,"favorited":false,"contributors":null,"geo":null,"truncated":false,"created_at":"Wed Feb 29 19:42:02 +0000 2012","user":{"is_translator":false,"follow_request_sent":null,"statuses_count":142,"profile_background_color":"C0DEED","default_profile":false,"lang":"en","notifications":null,"profile_background_tile":true,"location":"","profile_sidebar_fill_color":"ffffff","followers_count":8,"profile_image_url":"http:\/\/a1.twimg.com\/profile_images\/1540298033\/phatkicks_normal.jpg","contributors_enabled":false,"profile_background_image_url_https":"https:\/\/si0.twimg.com\/profile_background_images\/365782739\/doof.jpg","description":"I am just a testing account, following me probably won't gain you very much","following":null,"profile_sidebar_border_color":"C0DEED","profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1540298033\/phatkicks_normal.jpg","default_profile_image":false,"show_all_inline_media":false,"verified":false,"profile_use_background_image":true,"favourites_count":1,"friends_count":5,"profile_text_color":"333333","protected":false,"profile_background_image_url":"http:\/\/a3.twimg.com\/profile_background_images\/365782739\/doof.jpg","time_zone":"Pacific Time (US & Canada)","created_at":"Fri Sep 09 16:13:20 +0000 2011","name":"fakekurrik","geo_enabled":true,"profile_link_color":"0084B4","url":"http:\/\/blog.roomanna.com","id":370773112,"id_str":"370773112","listed_count":0,"utc_offset":-28800,"screen_name":"fakekurrik"},"id":174942523154894848,"id_str":"174942523154894848"}

The 1953 indicates how many bytes to read off of the stream to get the rest of the Tweet (including \r\n). The next length delimiter will occur exactly after 1953 bytes.
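The framing described above can be read with a small loop. The following is a minimal sketch, not an official client: it assumes stream is a buffered binary file-like object (the HTTP library has already undone the chunked transfer encoding) whose read(n) returns n bytes unless the stream ends. The name read_delimited is hypothetical.

```python
import io


def read_delimited(stream):
    """Yield each status payload (as bytes) from a delimited=length stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        line = line.strip()
        if not line:
            continue  # keep-alive newline inserted before a length
        length = int(line)
        payload = stream.read(length)  # includes the trailing \r\n
        if len(payload) < length:
            return  # stream ended in the middle of a status
        yield payload


# Usage: an in-memory stream stands in for the HTTP response body.
body = b'{"text":"hi"}\r\n'
demo = io.BytesIO(b"\r\n" + str(len(body)).encode() + b"\n" + body)
statuses = list(read_delimited(demo))
```

Each yielded payload can then be passed to a JSON decoder.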

stall_warnings

This parameter may be used on all streaming endpoints, unless explicitly noted.

Setting this parameter to the string true will cause periodic messages to be delivered if the client is in danger of being disconnected. These messages are only sent when the client is falling behind, and will occur at a maximum rate of about once every 5 minutes. This parameter is most appropriate for clients with high-bandwidth connections, such as the firehose.

Such warning messages will look like:

{
  "warning":{
    "code":"FALLING_BEHIND",
    "message":"Your connection is falling behind and messages are being queued for delivery to you. Your queue is now over 60% full. You will be disconnected when the queue is full.",
    "percent_full": 60
  }
}
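A client might pick these warnings out of the stream as follows. This is only a sketch based on the example message above; the function name stall_warning_percent is hypothetical, and any field not shown in the example is not assumed to exist.

```python
import json


def stall_warning_percent(raw_message):
    """Return percent_full for a FALLING_BEHIND warning, otherwise None."""
    message = json.loads(raw_message)
    warning = message.get("warning") if isinstance(message, dict) else None
    if isinstance(warning, dict) and warning.get("code") == "FALLING_BEHIND":
        return warning.get("percent_full")
    return None


sample = '{"warning":{"code":"FALLING_BEHIND","message":"...","percent_full":60}}'
percent = stall_warning_percent(sample)
```

A client could log the returned value or reduce its per-message work when the value rises.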

filter_level

This parameter may be used on all streaming endpoints, unless explicitly noted.

Setting this parameter to one of none, low, or medium will set the minimum value of the filter_level Tweet attribute required to be included in the stream. The default value is none, which includes all available Tweets.
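The "minimum value" semantics can be illustrated with a client-side check. This is a sketch under two assumptions: the levels are ordered none, low, medium as listed above, and a Tweet without a filter_level attribute is treated as none. The helper name passes_filter_level is hypothetical.

```python
# Levels in ascending order, as listed in the parameter description.
FILTER_LEVELS = ("none", "low", "medium")


def passes_filter_level(tweet, minimum="none"):
    """Return True if the Tweet's filter_level is at least the minimum."""
    level = tweet.get("filter_level", "none")  # assumption: missing means none
    return FILTER_LEVELS.index(level) >= FILTER_LEVELS.index(minimum)
```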

When displaying a stream of Tweets to end users (dashboards or live feeds at a presentation or conference, for example) it is suggested that you set this value to medium.

See Introducing new metadata for Tweets for more information.

language

This parameter may be used on all streaming endpoints, unless explicitly noted.

Setting this parameter to a comma-separated list of BCP 47 language identifiers corresponding to any of the languages listed on Twitter's advanced search page will only return Tweets that have been detected as being written in the specified languages. For example, connecting with language=en will only stream Tweets detected to be in the English language.
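The parameters described so far can be combined into a single query string. The sketch below uses Python's standard urlencode; the chosen values (including the second language, es) are illustrative only.

```python
from urllib.parse import urlencode

params = {
    "delimited": "length",
    "stall_warnings": "true",
    "filter_level": "medium",
    "language": ",".join(["en", "es"]),  # comma-separated list
}
query = urlencode(params)
# query == "delimited=length&stall_warnings=true&filter_level=medium&language=en%2Ces"
```

urlencode percent-encodes the comma as %2C, which decodes back to a comma on the receiving side.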

follow

Available on POST statuses/filter.

A comma-separated list of user IDs, indicating the users whose Tweets should be delivered on the stream. Following protected users is not supported. For each user specified, the stream will contain:

  • Tweets created by the user.
  • Tweets which are retweeted by the user.
  • Replies to any Tweet created by the user.
  • Retweets of any Tweet created by the user.
  • Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).

The stream will not contain:

  • Tweets mentioning the user (e.g. “Hello @twitterapi!”).
  • Manual Retweets created without pressing a Retweet button (e.g. “RT @twitterapi The API is great”).
  • Tweets by protected users.

track

Available on POST statuses/filter.

A comma-separated list of phrases which will be used to determine what Tweets will be delivered on the stream. A phrase may be one or more terms separated by spaces, and a phrase will match if all of the terms in the phrase are present in the Tweet, regardless of order and ignoring case. By this model, you can think of commas as logical ORs, while spaces are equivalent to logical ANDs (e.g. ‘the twitter’ is the AND twitter, and ‘the,twitter’ is the OR twitter).
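The OR/AND model can be sketched in a few lines. This is a deliberately simplified illustration, not the matching algorithm the service uses: it splits the Tweet text on whitespace only and does not model punctuation, entities, or URLs. The function name matches_track is hypothetical.

```python
def matches_track(track, tweet_text):
    """Simplified model: commas are OR, spaces inside a phrase are AND."""
    words = set(tweet_text.lower().split())
    for phrase in track.split(","):
        terms = phrase.lower().split()
        # A phrase matches when every one of its terms is present, in any order.
        if terms and all(term in words for term in terms):
            return True
    return False
```

Under this model, matches_track("the twitter", "I like Twitter") is False because "the" is missing, while matches_track("the,twitter", "I like Twitter") is True.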

The text of the Tweet and some entity fields are considered for matches. Specifically, the text attribute of the Tweet, expanded_url and display_url for links and media, text for hashtags, and screen_name for user mentions are checked for matches.

Each phrase must be between 1 and 60 bytes, inclusive.
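Because the limit is expressed in bytes rather than characters, it may help to check phrases before connecting. The sketch below assumes the byte count is taken over the UTF-8 encoding of the phrase; build_track is a hypothetical helper.

```python
def build_track(phrases):
    """Validate each phrase and join them into a track parameter value."""
    for phrase in phrases:
        if "," in phrase:
            raise ValueError(f"phrase contains a comma: {phrase!r}")
        size = len(phrase.encode("utf-8"))  # assumption: bytes means UTF-8 bytes
        if not 1 <= size <= 60:
            raise ValueError(f"phrase is {size} bytes, expected 1 to 60: {phrase!r}")
    return ",".join(phrases)


track = build_track(["twitter api", "touché"])
```

Note that "touché" is six characters but seven bytes in UTF-8, so non-ASCII phrases reach the limit sooner than their character count suggests.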

Exact matching of phrases (equivalent to quoted phrases in most search engines) is not supported.

Punctuation and special characters will be considered part of the term they are adjacent to. In this sense, "hello." is a different track term than "hello". However, matches will ignore punctuation present in the Tweet. So "hello" will match both "hello world" and "my brother says hello." Note that punctuation is not considered to be part of a #hashtag or @mention, so a track term containing punctuation will not match either #hashtags or @mentions.

UTF-8 characters will match exactly, even in cases where an "equivalent" ASCII character exists. For example, "touché" will not match a Tweet containing "touche".

Non-space separated languages, such as CJK, are currently unsupported.

URLs are considered words for the purposes of matches, which means that the entire domain and path must be included in the track query for a Tweet containing a URL to match. Note that display_url does not contain a protocol, so the protocol is not required to perform a match.

Twitter currently canonicalizes the domain "www.example.com" to "example.com" before the match is performed, so omit the "www" from URL track terms.

Finally, to address a common use case where you may want to track all mentions of a particular domain name (i.e., regardless of subdomain or path), you should use "example com" as the track parameter for "example.com" (notice the lack of period between "example" and "com" in the track parameter). This will be over-inclusive, so make sure to do additional pattern-matching in your code. See the table below for more examples related to this issue.
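The additional pattern-matching step might look like the following. It is a sketch of one possible client-side filter, not something the API provides: the regular expression and the name mentions_domain are assumptions, and a real client would probably also inspect the expanded_url of each URL entity.

```python
import re


def mentions_domain(text, domain="example.com"):
    """Return True if text contains the domain, optionally with subdomains."""
    pattern = (
        r"(?<![\w-])(?:[\w-]+\.)*"   # optional subdomains such as www. or foo.
        + re.escape(domain)
        + r"(?![\w-])(?!\.[\w-])"    # reject example.community and example.com.au
    )
    return re.search(pattern, text, re.IGNORECASE) is not None
```

Applied to Tweets delivered for the track term "example com", this keeps "foo.example.com/bar" and drops text that merely contains the words "example" and "com".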

Track examples:

Parameter value: Twitter
  Will match:
    • TWITTER
    • twitter
    • "Twitter"
    • twitter.
    • #twitter
    • @twitter
    • http://twitter.com
  Will not match:
    • TwitterTracker
    • #newtwitter

Parameter value: Twitter's
  Will match:
    • I like Twitter's new design
  Will not match:
    • Someday I'd like to visit @Twitter's office

Parameter value: twitter api,twitter streaming
  Will match:
    • The Twitter API is awesome
    • The twitter streaming service is fast
    • Twitter has a streaming API
  Will not match:
    • I'm new to Twitter

Parameter value: example.com
  Will match:
    • Someday I will visit example.com
  Will not match:
    • There is no example.com/foobarbaz

Parameter value: example.com/foobarbaz
  Will match:
    • example.com/foobarbaz
    • www.example.com/foobarbaz
  Will not match:
    • example.com

Parameter value: www.example.com/foobarbaz
  Will match: (nothing)
  Will not match:
    • www.example.com/foobarbaz

Parameter value: example com
  Will match:
    • example.com
    • www.example.com
    • foo.example.com
    • foo.example.com/bar
    • I hope my startup isn't merely another example of a dot com boom!


To get a better feel for which keywords match the content of a Tweet, you can experiment interactively on the Streaming API keyword matching page.

locations

Available on POST statuses/filter.

A comma-separated list of longitude,latitude pairs specifying a set of bounding boxes to filter Tweets by. Only geolocated Tweets falling within the requested bounding boxes will be included. Unlike the Search API, the user’s location field is not used to filter Tweets.

Each bounding box should be specified as a pair of longitude and latitude pairs, with the southwest corner of the bounding box coming first. For example:

Parameter value                            Tracks Tweets from...
-122.75,36.8,-121.75,37.8                  San Francisco
-74,40,-73,41                              New York City
-122.75,36.8,-121.75,37.8,-74,40,-73,41    San Francisco OR New York City
-180,-90,180,90                            Any geotagged Tweet
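A parameter value like those in the table can be assembled from corner coordinates. The sketch below is an illustration only: build_locations is a hypothetical helper, and its ordering check assumes that a box does not cross the 180° meridian.

```python
def build_locations(boxes):
    """Build a locations value from (west, south, east, north) tuples."""
    parts = []
    for west, south, east, north in boxes:
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            raise ValueError("longitude out of range")
        if not (-90 <= south <= 90 and -90 <= north <= 90):
            raise ValueError("latitude out of range")
        if west >= east or south >= north:
            raise ValueError("the southwest corner must come first")
        # Longitude precedes latitude, southwest corner before northeast.
        parts.extend(str(value) for value in (west, south, east, north))
    return ",".join(parts)


locations = build_locations([(-122.75, 36.8, -121.75, 37.8), (-74, 40, -73, 41)])
# locations == "-122.75,36.8,-121.75,37.8,-74,40,-73,41"
```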

Bounding boxes do not act as filters for other filter parameters. For example, track=twitter&locations=-122.75,36.8,-121.75,37.8 would match any Tweets containing the term Twitter (even non-geo Tweets) OR coming from the San Francisco area.

The streaming API uses the following heuristic to determine whether a given Tweet falls within a bounding box:

  • If the coordinates field is populated, the values there will be tested against the bounding box. Note that this field uses geoJSON order (longitude, latitude).
  • If coordinates is empty but place is populated, the region defined in place is checked for intersection against the locations bounding box. Any overlap will match.
  • If none of the rules listed above match, the Tweet does not match the location query. Note that the geo field is deprecated, and ignored by the streaming API.

If you would like to exclude place matches or only include places which fall completely within the bounding box, your code will have to perform an additional filtering step after reading the filtered stream.
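Such an additional filtering step could be sketched as follows. The payload shapes are assumptions not stated on this page: coordinates is taken to be a GeoJSON Point ({"coordinates": [longitude, latitude]}) and place.bounding_box.coordinates[0] a ring of [longitude, latitude] points. The function names and the places option are hypothetical.

```python
def place_extent(place):
    """Return (west, south, east, north) of a place's bounding box ring."""
    ring = place["bounding_box"]["coordinates"][0]
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return min(lons), min(lats), max(lons), max(lats)


def matches_box(tweet, box, places="overlap"):
    """places: 'overlap' (as described above), 'contained', or 'exclude'."""
    bw, bs, be, bn = box
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # geoJSON order: longitude, latitude
        return bw <= lon <= be and bs <= lat <= bn
    place = tweet.get("place")
    if place and places != "exclude":
        w, s, e, n = place_extent(place)
        if places == "contained":
            return bw <= w and bs <= s and e <= be and n <= bn
        return w <= be and bw <= e and s <= bn and bs <= n  # any overlap
    return False
```

With places="overlap" the function mirrors the heuristic described above; the other two modes implement the stricter client-side behaviours.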

Note that native Retweets are not matched by this parameter. While the original Tweet may have a location, the Retweet will not.

count

This parameter requires elevated access to use.

When reconnecting to a streaming endpoint, elevated access clients may include the count parameter to attempt to backfill missed messages which occurred during the disconnect period. The supplied value may be an integer from 1 to 150000 or from -1 to -150000. If a positive number is specified, the stream will transition to live values once the backfill has been delivered to the client. If a negative number is specified, the stream will disconnect once the backfill has been delivered to the client, which may be useful for debugging.
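The accepted range and the meaning of the sign can be captured in a small check. This is a sketch; backfill_mode and its return strings are hypothetical names chosen for illustration.

```python
def backfill_mode(count):
    """Validate a count value and describe what happens after the backfill."""
    if isinstance(count, bool) or not isinstance(count, int):
        raise TypeError("count must be an integer")
    if not 1 <= abs(count) <= 150000:
        raise ValueError("count must be in 1..150000 or -150000..-1")
    # Positive: continue with live values. Negative: disconnect afterwards.
    return "then_live" if count > 0 else "then_disconnect"
```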

Note that use of this parameter should be carefully considered, as high values increase the chance of a subsequent disconnect. To demonstrate this, consider the case where a client connects without backfill. Upon establishing a connection, Twitter will allocate a fixed-size queue, and begin adding messages to be streamed to the client. If the client reads too slowly, the queue will fill up. Once full, Twitter will disconnect the client:

[Image: a queue filling up, followed by a disconnect when full]

When a client connects with backfill, that number of messages are immediately added to the queue. The client must read messages faster than the current rate of Tweets being added to the queue, as the available buffer before a disconnect occurs can be much smaller than when connecting without backfill.

[Image: a smaller queue after backfill is requested]
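The effect of backfill on the remaining buffer can be estimated with simple arithmetic. The sketch below is a toy model: the queue capacity and the message rates are hypothetical inputs, since this page states only that the queue has a fixed size.

```python
def seconds_until_disconnect(capacity, backfill, arrival_rate, read_rate):
    """Return seconds until the queue is full, or None if the client keeps up.

    capacity and backfill are in messages; rates are in messages per second.
    """
    free = capacity - backfill  # backfill occupies the queue immediately
    if free <= 0:
        return 0.0
    net_growth = arrival_rate - read_rate
    if net_growth <= 0:
        return None  # the client reads at least as fast as messages arrive
    return free / net_growth
```

For example, with a hypothetical capacity of 10000 messages, 60 messages arriving per second and 50 read per second, the model gives 1000 seconds without backfill but only 100 seconds with a backfill of 9000.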

with

Available on GET user and GET site.

The with parameter controls the types of messages delivered to User and Site Streams clients.

  • The default for Site Streams is with=user, which only streams messages from the user associated with the stream.
  • The default for User Streams is with=followings, which adds messages from accounts the user follows, equivalent to the user's home timeline.

Despite the difference in defaults, Site and User each accept both user and followings parameter values.

replies

Available on GET user and GET site.

By default @replies are only sent if the current user follows both the sender and receiver of the reply. For example, consider the case where Alice follows Bob, but Alice doesn’t follow Carol. By default, if Bob @replies Carol, Alice does not see the Tweet. This mimics twitter.com and api.twitter.com behavior. To have such Tweets returned in a streaming connection, specify replies=all when connecting.

stringify_friend_ids

Available on GET user and GET site.

By default, user and site streams send the friends list preamble as an array of integers (equivalent to stringify_friend_ids=false). However, as the number of Twitter users grows, user IDs are quickly approaching the 32-bit integer limit; once they pass it, your code will have to handle 64-bit integers. Some languages or libraries (including JSON decoders) expect that integers provided in JSON are 32-bit and will therefore behave erroneously and potentially unpredictably. If natively interpreting integers as 64-bit poses a challenge for you, we offer the stringify_friend_ids=true parameter to have the friends list preamble be an array of strings (instead of integers). If you use this parameter, note that it will suppress the friends array and return the friends_str array in its place. See the friends list message type entry for an example payload.
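A client that wants to work with either form of the preamble could normalise it to strings. This is a sketch using only the friends and friends_str field names mentioned above; the function name friend_ids is hypothetical. Python's json module decodes large integers without loss, so the conversion is shown for uniformity rather than out of necessity in this language.

```python
import json


def friend_ids(raw_preamble):
    """Return friend IDs as strings, whichever form of the preamble was sent."""
    message = json.loads(raw_preamble)
    if "friends_str" in message:
        return list(message["friends_str"])
    return [str(user_id) for user_id in message.get("friends", [])]
```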