See also Entities from the Field Guide.
Why Tweet Entities?
Tweet text can mention other users or lists, and it can also contain URLs, media, and hashtags. Instead of parsing the text yourself to extract those entities, you can use the entities attribute, which contains this data already parsed and structured.
How can I use these Tweet Entities?
As usual, it is important to be tolerant of new fields and empty/null values in all responses. Note also that:
- With the REST API v1, you'll need to set the include_entities parameter to 1 (or true) if you want entities to be included. In API v1.1, entities are always included unless you set include_entities to 0 (or false).
- With the Streaming API, entities are automatically included.
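The advice about tolerating new fields and empty/null values can be sketched as follows. This is a minimal illustration, not an official client: the helper name get_entities is hypothetical, and the Tweet is assumed to be a dict already decoded from the JSON response.

```python
# Entity types described on this page; unknown extra keys are simply ignored.
KNOWN_ENTITY_TYPES = ("media", "urls", "user_mentions", "hashtags", "symbols")


def get_entities(tweet):
    """Return a dict mapping each known entity type to a list.

    Missing keys, null values, and a missing "entities" attribute
    all collapse to empty lists, so callers can iterate safely.
    """
    entities = tweet.get("entities") or {}
    return {key: entities.get(key) or [] for key in KNOWN_ENTITY_TYPES}


# Hypothetical decoded Tweet with a null "urls" value and no "media" key.
tweet = {
    "text": "Loved #devnestSF",
    "entities": {
        "hashtags": [{"text": "devnestSF", "indices": [6, 16]}],
        "urls": None,
    },
}

entities = get_entities(tweet)
print(entities["hashtags"][0]["text"])  # devnestSF
print(entities["media"])                # []
```

Normalizing once at the edge keeps the rest of the code free of repeated null checks.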
The media entity
An array of media attached to the Tweet with the new Twitter Photo Upload feature. Each media entity comes with the following attributes:
id | The media ID (int format) |
id_str | The media ID (string format) |
media_url | The URL of the media file (see the sizes attribute for available sizes) |
media_url_https | The SSL URL of the media file (see the sizes attribute for available sizes) |
url | The media URL that was extracted |
display_url | Not a URL but a string to display instead of the media URL |
expanded_url | The fully resolved media URL |
sizes | The available sizes: thumb, small, medium and large. The media_url defaults to medium, but you can retrieve the media in a different size by appending a colon and the size key (for example: http://p.twimg.com/ARACoSZs_QA8BDB.jpg:thumb). Each available size comes with three attributes that describe it: w, the width (in pixels) of the media in this size; h, the height (in pixels) of the media in this size; resize, how the media was resized to this size (crop or fit) |
type | Only photo for now |
indices | The character positions the media was extracted from |
"text": "#Photos on Twitter: taking flight http://t.co/qbJx26r",
"entities": {
"media": [
{
"id": 76360760611180544,
"id_str": "76360760611180544",
"media_url": "http://p.twimg.com/AQ9JtQsCEAA7dEN.jpg",
"media_url_https": "https://p.twimg.com/AQ9JtQsCEAA7dEN.jpg",
"url": "http://t.co/qbJx26r",
"display_url": "pic.twitter.com/qbJx26r",
"expanded_url": "http://twitter.com/twitter/status/76360760606986241/photo/1",
"sizes": {
"large": {
"w": 700,
"resize": "fit",
"h": 466
},
"medium": {
"w": 600,
"resize": "fit",
"h": 399
},
"small": {
"w": 340,
"resize": "fit",
"h": 226
},
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
}
},
"type": "photo",
"indices": [
34,
53
]
}
],
"urls": [
],
"user_mentions": [
],
"hashtags": [
]
}
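The colon-plus-size-key convention described for the sizes attribute can be sketched like this. The helper name media_url_for_size is hypothetical, and the media dict is a trimmed copy of the example above.

```python
def media_url_for_size(media, size="medium", https=True):
    """Build the URL for one size of a media entity.

    Returns None when the requested size is not listed in the
    entity's "sizes" attribute or when the base URL is missing.
    """
    sizes = media.get("sizes") or {}
    if size not in sizes:
        return None
    base = media.get("media_url_https" if https else "media_url")
    if not base:
        return None
    # Per the sizes description: append a colon and the size key.
    return f"{base}:{size}"


# Trimmed copy of the media entity from the example above.
media = {
    "media_url": "http://p.twimg.com/AQ9JtQsCEAA7dEN.jpg",
    "media_url_https": "https://p.twimg.com/AQ9JtQsCEAA7dEN.jpg",
    "sizes": {
        "large": {"w": 700, "resize": "fit", "h": 466},
        "thumb": {"w": 150, "resize": "crop", "h": 150},
    },
}

print(media_url_for_size(media, "thumb"))
# https://p.twimg.com/AQ9JtQsCEAA7dEN.jpg:thumb
```

Checking the sizes attribute first avoids requesting a size that the entity does not advertise.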
The urls entity
An array of URLs extracted from the Tweet text. Each URL entity comes with the following attributes:
url | The URL that was extracted |
display_url | (only for t.co links) Not a URL but a string to display instead of the URL |
expanded_url | (only for t.co links) The fully resolved URL |
indices | The character positions the URL was extracted from |
"text": "Twitter for Mac is now easier and faster, and you can open multiple windows at once http://t.co/0JG5Mcq",
"entities": {
"media": [
],
"urls": [
{
"url": "http://t.co/0JG5Mcq",
"display_url": "blog.twitter.com/2011/05/twitte…",
"expanded_url": "http://blog.twitter.com/2011/05/twitter-for-mac-update.html",
"indices": [
84,
103
]
}
],
"user_mentions": [
],
"hashtags": [
]
}
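One common use of the urls entity is to render the Tweet text with each extracted URL replaced by a link that shows display_url and points at expanded_url. The sketch below is one possible approach, not a prescribed one: the function name link_urls is hypothetical, and it assumes the indices count characters the same way Python string slicing does.

```python
import html


def link_urls(text, urls):
    """Return HTML for `text` with each URL entity turned into a link.

    Falls back to the raw `url` when display_url or expanded_url is
    absent (they are only present for t.co links).
    """
    pieces = []
    cursor = 0
    for entity in sorted(urls, key=lambda e: e["indices"][0]):
        start, end = entity["indices"]
        # Escape the plain text between the previous entity and this one.
        pieces.append(html.escape(text[cursor:start]))
        href = entity.get("expanded_url") or entity["url"]
        label = entity.get("display_url") or entity["url"]
        pieces.append(
            '<a href="{}">{}</a>'.format(html.escape(href, quote=True), html.escape(label))
        )
        cursor = end
    pieces.append(html.escape(text[cursor:]))
    return "".join(pieces)


text = ("Twitter for Mac is now easier and faster, and you can open "
        "multiple windows at once http://t.co/0JG5Mcq")
urls = [{
    "url": "http://t.co/0JG5Mcq",
    "display_url": "blog.twitter.com/2011/05/twitte…",
    "expanded_url": "http://blog.twitter.com/2011/05/twitter-for-mac-update.html",
    "indices": [84, 103],
}]

print(link_urls(text, urls))
```

Walking the entities in order of their start index and tracking a cursor means the original indices stay valid, since the source text itself is never modified.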
The user_mentions entity
An array of Twitter screen names extracted from the Tweet text. Each User entity comes with the following attributes:
id | The User ID (int format) |
id_str | The User ID (string format) |
screen_name | The User screen name |
name | The User's full name |
indices | The character positions the User mention was extracted from |
"text": "@rno Et demi!",
"entities": {
"media": [
],
"urls": [
],
"user_mentions": [
{
"id": 22548447,
"id_str": "22548447",
"screen_name": "rno",
"name": "Arnaud Meunier",
"indices": [
0,
4
]
}
],
"hashtags": [
]
}
The hashtags entity
An array of hashtags extracted from the Tweet text. Each Hashtag entity comes with the following attributes:
text | The Hashtag text |
indices | The character positions the Hashtag was extracted from |
"text": "Loved #devnestSF",
"entities": {
"media": [
],
"urls": [
],
"user_mentions": [
],
"hashtags": [
{
"text": "devnestSF",
"indices": [
6,
16
]
}
]
}
The symbols entity
An array of financial symbols (each starting with a dollar sign) extracted from the Tweet text. As with hashtags, each symbol entity comes with the following attributes:
text | The symbol text |
indices | The character positions the symbol was extracted from |
"text": "$PEP or $COKE?",
"entities": {
"hashtags": [],
"symbols": [
{
"text": "PEP",
"indices": [
0,
4
]
},
{
"text": "COKE",
"indices": [
8,
13
]
}
],
"urls": [],
"user_mentions": []
}
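Every entity type carries an indices pair, so the same slicing logic works for media, URLs, mentions, hashtags, and symbols. The following sketch lists each entity with the exact substring it was extracted from. The function name entity_spans is hypothetical, and the code assumes that the first index is the start position and the second is one past the last character, which is what the examples on this page show.

```python
def entity_spans(tweet):
    """Return sorted (start, end, kind, substring) tuples for all entities.

    Values that are not lists, or items without a two-element
    "indices" pair, are skipped to stay tolerant of new fields.
    """
    text = tweet.get("text") or ""
    entities = tweet.get("entities") or {}
    spans = []
    for kind, items in entities.items():
        if not isinstance(items, list):
            continue
        for item in items:
            indices = item.get("indices") if isinstance(item, dict) else None
            if not indices or len(indices) != 2:
                continue
            start, end = indices
            spans.append((start, end, kind, text[start:end]))
    return sorted(spans)


# The symbols example from above, decoded into a dict.
tweet = {
    "text": "$PEP or $COKE?",
    "entities": {
        "hashtags": [],
        "symbols": [
            {"text": "PEP", "indices": [0, 4]},
            {"text": "COKE", "indices": [8, 13]},
        ],
        "urls": [],
        "user_mentions": [],
    },
}

for span in entity_spans(tweet):
    print(span)
# (0, 4, 'symbols', '$PEP')
# (8, 13, 'symbols', '$COKE')
```

Note that the sliced substring includes the leading $ (or # and @ for hashtags and mentions), while the entity's own text or screen_name attribute does not.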