YouTube API v2.0 – Captions

Note: The YouTube Data API (v2) has been officially deprecated as of March 4, 2014. Please refer to our deprecation policy for more information.

Adding captions to your video files can help users to locate and understand your videos. You can add captions to a video via the API by creating a caption track and uploading it. YouTube supports a variety of caption file formats, including plain text transcripts.

YouTube may also use automatic speech recognition (ASR) to generate a caption track for a video.

This section discusses the caption file formats that YouTube supports. It also explains how to use the API to create, retrieve, update and delete caption tracks. These topics are covered in the following subsections:

  1. Requirements for caption operations
  2. Supported formats for caption files
  3. Retrieving a list of available caption tracks
  4. Creating a caption track
  5. Retrieving a caption track
  6. Updating a caption track
  7. Deleting a caption track

Requirements for caption operations

Please note the following requirements for executing caption-related API operations:

  • Captions are only available for API version 2.

  • Captions for a video can only be created, retrieved, modified and deleted by the owner of that video. To perform these operations for a video, you must submit authenticated API requests for which the video's owner is the logged-in user. Accordingly, API requests for caption operations must contain a properly formatted Authorization header.

  • Requests to create (POST), update (PUT) or delete (DELETE) captions must identify your developer key using either the X-GData-Key request header or the key request parameter.
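
The header requirements above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function name caption_request_headers is hypothetical, the Bearer token form and the GData-Version header are taken from the sample requests later in this section, and ACCESS_TOKEN and DEVELOPER_KEY are placeholders.

```python
def caption_request_headers(access_token, developer_key=None):
    """Build the headers shared by caption requests (hypothetical helper).

    access_token and developer_key are placeholder values supplied by the caller.
    """
    headers = {
        "Authorization": "Bearer " + access_token,
        "GData-Version": "2",  # captions are only available for API version 2
    }
    if developer_key is not None:
        # Needed for POST, PUT and DELETE requests; the key request
        # parameter is the alternative way to identify the developer key.
        headers["X-GData-Key"] = "key=" + developer_key
    return headers
```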

Supported formats for caption files

YouTube supports many different formats for caption files, including RealText (.rt), SAMI (.smi) and Media RSS. If you already have captions available, we recommend that you upload them in their original format, whatever that may be. If you only have unformatted caption data, such as a transcript without timing data, we recommend using SubRip (*.SRT) or SubViewer (*.SUB) to generate formatted captions.

YouTube also supports a simple format that is compatible with both the SubRip and SubViewer formats. In this simple format, each caption is divided into three segments that appear in the following order:

  1. Timecodes specify when and for how long YouTube should display a caption, in HH:MM:SS.FS format. Timecodes, which are measured from the beginning of the video, contain the following units:

    • HH – Hours (00, 01, etc.)
    • MM – Minutes (00-59)
    • SS – Seconds (00-59)
    • FS – Fractional seconds (.000-.999)

    YouTube supports the following time constructs:

    • HH:MM:SS.FS,HH:MM:SS.FS – A caption appears at the first time offset and stops displaying at the second time offset. This format is compatible with the SubViewer format.
    • HH:MM:SS.FS --> HH:MM:SS.FS – A caption appears at the first time offset and stops displaying at the second time offset. To make this format completely compatible with the SubRip format, you can insert a "subtitle number" before the timecodes.
    • HH:MM:SS.FS – A caption appears at the designated time offset. Since no stop time is specified, YouTube will try to determine an appropriate stop time. For example, the caption might stop displaying just before the next caption is scheduled to appear.

  2. The caption text consists of one or more lines of text that will be displayed on the screen during the time offsets. You must use UTF-8 encoding for the caption text.

  3. A blank line marks the end of each caption.

The following example demonstrates this simple caption format:

0:01:23.000,0:01:25.000
This text displays for two seconds starting
1 minute and 23 seconds into the video.

0:02:20.250,0:02:23.8
This text displays from 2 minutes and 20.25 seconds after the start
of the video until 2 minutes and 23.8 seconds after the start of the video.

0:03:14.159
This text displays beginning 3 minutes and 14.159 seconds
after the start of the video for an undefined length of time. 
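
To make the three-part structure concrete, the following Python sketch parses text in the simple format into (start, end, text) tuples. It is an illustration under stated assumptions, not YouTube's parser: the names to_seconds and parse_simple_captions are hypothetical, the hour field is allowed to have one or more digits (as in the example above), and a leading numeric line is treated as an optional SubRip subtitle number.

```python
import re

# One timecode: hours, minutes, seconds and an optional fractional part.
_TIMECODE = r"\d+:\d{2}:\d{2}(?:\.\d+)?"
# A timing line: a start time, optionally followed by "," or "-->" and an end time.
_TIMING_LINE = re.compile(
    r"^\s*(" + _TIMECODE + r")(?:\s*(?:,|-->)\s*(" + _TIMECODE + r"))?\s*$"
)


def to_seconds(timecode):
    """Convert an HH:MM:SS.FS timecode to seconds from the start of the video."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_simple_captions(text):
    """Return a list of (start, end, caption_text); end is None if omitted."""
    captions = []
    # A blank line marks the end of each caption.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]  # skip an optional SubRip "subtitle number"
        if not lines:
            continue
        match = _TIMING_LINE.match(lines[0])
        if match is None:
            raise ValueError("expected a timecode line, got: " + lines[0])
        start = to_seconds(match.group(1))
        end = to_seconds(match.group(2)) if match.group(2) else None
        captions.append((start, end, "\n".join(lines[1:])))
    return captions
```

When the stop time is omitted, the sketch returns None for it rather than guessing, since the stop time in that case is chosen by YouTube.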

Retrieving a list of available caption tracks

Each video entry contains a <link> tag for which the rel attribute value is http://gdata.youtube.com/schemas/2007#video.captionTracks. (Note that this link is only visible to the owner of a video.) The tag's href attribute identifies the captions URL for the video.

<link rel="http://gdata.youtube.com/schemas/2007#video.captionTracks"
     type="application/atom+xml"
     href="https://gdata.youtube.com/feeds/api/videos/ZTUVgYoeN_b/captions"
     yt:hasEntries="true|false"/>

Caption tracks are available for a video if the <link> tag's yt:hasEntries attribute has a value of true. If caption tracks are available for the video, you can retrieve the list of tracks by sending a GET request to the URL in the link tag. You can use any of the following input parameters in a request to retrieve caption tracks:

  • lr – This parameter filters the list of caption tracks by language. For example, if you include lr=de in your request, the API response will only list caption tracks in the German language. The parameter value must be an ISO 639-1 two-letter language code. By default, the API response will include tracks for all languages.
  • max-results – This parameter specifies the maximum number of caption tracks to include in the result set.
  • start-index – This parameter specifies the index of the first caption track that should be included in the result set. The parameter uses a one-based index, meaning the first track is 1, the second track is 2 and so forth.
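
As a rough illustration of how these parameters combine, the Python sketch below appends them to a captions URL. The function name caption_tracks_url is hypothetical, and the sketch assumes the captions URL passed in has no query string of its own.

```python
from urllib.parse import urlencode


def caption_tracks_url(captions_url, lr=None, max_results=None, start_index=None):
    """Append the optional lr, max-results and start-index parameters."""
    params = {}
    if lr is not None:
        params["lr"] = lr  # ISO 639-1 two-letter language code, e.g. "de"
    if max_results is not None:
        params["max-results"] = max_results
    if start_index is not None:
        if start_index < 1:
            raise ValueError("start-index uses a one-based index")
        params["start-index"] = start_index
    if not params:
        return captions_url
    return captions_url + "?" + urlencode(params)
```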

Sample caption tracks feed

The following sample API response shows a caption tracks feed with two entries. The first entry contains a <yt:derived> tag that indicates that the track was generated using automatic speech recognition.

<feed xmlns='http://www.w3.org/2005/Atom' 
  xmlns:app='http://www.w3.org/2007/app'
  xmlns:openSearch='http://a9.com/-/spec/opensearch/1.1/'
  xmlns:gd='http://schemas.google.com/g/2005'
  xmlns:yt='http://gdata.youtube.com/schemas/2007'
  gd:etag='W/"DkcNRH4zfSp7ImA9WxJXEUk."'>
  <id>tag:youtube.com,2008:videos:EdDc7sWjCL4:captions</id>
  <updated>2010-08-18T22:42:42.179Z</updated>
  <category scheme='http://schemas.google.com/g/2005#kind'
    term='http://gdata.youtube.com/schemas/2007#captionTrack'/>
  <title>
    Caption Tracks for Learn about HTML5 and the Future of the Web
  </title>
  <logo>http://www.youtube.com/img/pic_youtubelogo_123x63.gif</logo>
  <link rel='related' type='application/atom+xml'
    href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4?v=2'/>
  <link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml'
    href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions?v=2'/>
  <link rel='http://schemas.google.com/g/2005#batch' type='application/atom+xml'
    href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions/batch?v=2'/>
  <link rel='self' type='application/atom+xml'
     href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions?...'/>
  <link rel='http://schemas.google.com/g/2005#post' type='application/atom+xml'
    href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions?v=2'/>
  <link rel='service' type='application/atomsvc+xml' 
    href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions?alt=...'/>
  <author>
    <name>GoogleDevelopers</name>
    <uri>https://gdata.youtube.com/feeds/api/users/googledevelopers</uri>
  </author>
  <generator version='2.0' uri='http://gdata.youtube.com/'>YouTube data API</generator>
  <openSearch:totalResults>2</openSearch:totalResults>
  <openSearch:startIndex>1</openSearch:startIndex>
  <openSearch:itemsPerPage>25</openSearch:itemsPerPage>
  <entry gd:etag='W/"DkECQHk5cSp7ImA9WxJSGEw."'>
    <id>tag:youtube.com,2008:captions:ChkLEO3ZhwUaEAi-kYyt7J236BESAmVuGgAM</id>
    <published>2010-06-09T06:31:35.256-07:00</published>
    <updated>2010-06-09T06:31:35.256-07:00</updated>
    <app:edited>2010-06-09T06:31:35.256-07:00</app:edited>
    <category scheme='http://schemas.google.com/g/2005#kind'
      term='http://gdata.youtube.com/schemas/2007#captionTrack'/>
    <title/>
    <content type='application/vnd.youtube.timedtext'
      src='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captiondata/ChkLEO...'
      xml:lang='en'/>
    <link rel='self' type='application/atom+xml'
      href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions/ChkLEO...'/>
    <link rel='edit' type='application/atom+xml'
      href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions/ChkLEO...'/>
    <link rel='edit-media' type='application/vnd.youtube.timedtext'
      href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captiondata/ChkLEO...'/>
    <yt:derived>speechRecognition</yt:derived>
  </entry>
  <entry gd:etag='W/"DkECQHk5cSp7ImA9WxJSGEw."'>
    <id>tag:youtube.com,2008:captions:CiMLEO3ZhwUaGgi-kYyt7J236B...</id>
    <published>2010-06-20T23:46:22.156-07:00</published>
    <updated>2010-06-20T23:46:22.156-07:00</updated>
    <app:edited>2010-06-20T23:46:22.156-07:00</app:edited>
    <category scheme='http://schemas.google.com/g/2005#kind'
      term='http://gdata.youtube.com/schemas/2007#captionTrack'/>
    <title>auto-timed</title>
    <content type='application/vnd.youtube.timedtext'
      src='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captiondata/CiMLEO...'
      xml:lang='en'/>
    <link rel='self' type='application/atom+xml'
      href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions/CiMLEO...'/>
    <link rel='edit' type='application/atom+xml'
      href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captions/CiMLEO...'/>
    <link rel='edit-media' type='application/vnd.youtube.timedtext'
      href='https://gdata.youtube.com/feeds/api/videos/EdDc7sWjCL4/captiondata/CiMLEO...'/>
  </entry>
</feed>
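
A feed like the one above can be read with Python's standard xml.etree.ElementTree module. The sketch below is one possible approach, with a hypothetical function name (list_caption_tracks) and a return shape chosen for illustration. It collects each track's title, language and src URL, and flags tracks whose <yt:derived> value is speechRecognition.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the sample feed.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "yt": "http://gdata.youtube.com/schemas/2007",
}
# ElementTree exposes xml:lang under the fixed XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def list_caption_tracks(feed_xml):
    """Return one dict per <entry> in a caption tracks feed."""
    tracks = []
    root = ET.fromstring(feed_xml)
    for entry in root.findall("atom:entry", NS):
        content = entry.find("atom:content", NS)
        derived = entry.find("yt:derived", NS)
        tracks.append({
            "title": entry.findtext("atom:title", default="", namespaces=NS) or "",
            "lang": content.get(XML_LANG) if content is not None else None,
            "src": content.get("src") if content is not None else None,
            # True only for tracks generated by automatic speech recognition.
            "asr": derived is not None and derived.text == "speechRecognition",
        })
    return tracks
```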

Creating a caption track

To create a caption track, post a caption file or transcript to the http://schemas.google.com/g/2005#post link for the captions track feed. The following guidelines explain how to format your API request:

  • You must set the Content-Type header value to the following value:

    application/vnd.youtube.timedtext

  • You should use the Content-Language header to specify the language of the caption track. If you do not specify a language, the API server will attempt to identify the language. However, if the identification fails, then your API request will also fail.

    If you do specify the caption track language, then you must also specify the character set used so that YouTube can decode the file. We recommend that you use UTF-8 encoding whenever possible. To specify the character set, add the charset parameter to the Content-Type header value as shown below:

    application/vnd.youtube.timedtext; charset=UTF-8

  • You can use the Slug header to specify a title for the caption track. The header value should be URL-escaped. In addition, to specify non-ASCII characters in a track title, you need to ensure that the title is a UTF-8 string and then URL-escape it. For example, the string El mejor día (the best day) would be specified in a Slug header as El%20mejor%20d%C3%ADa.

  • The caption file must be 1MB or smaller.

The following sample API request demonstrates how to create a caption track:

POST /feeds/api/videos/VIDEO_ID/captions HTTP/1.1
Host: gdata.youtube.com
Content-Type: application/vnd.youtube.timedtext; charset=UTF-8
Content-Language: en
Slug: Title of caption track
Authorization: Bearer ACCESS_TOKEN
GData-Version: 2
X-GData-Key: key=DEVELOPER_KEY

<Caption File Data>
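
The same request could be assembled in Python with the standard urllib.request module, roughly as sketched below. The function name build_create_caption_request and the constant MAX_CAPTION_BYTES are hypothetical, and reading "1MB" as 1024 * 1024 bytes is an assumption. The sketch only builds the request object; passing it to urllib.request.urlopen would send it.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed interpretation of the "1MB or smaller" limit.
MAX_CAPTION_BYTES = 1024 * 1024


def build_create_caption_request(post_url, caption_text, language, title,
                                 access_token, developer_key):
    """Build (but do not send) a POST request that creates a caption track."""
    body = caption_text.encode("utf-8")
    if len(body) > MAX_CAPTION_BYTES:
        raise ValueError("caption file must be 1MB or smaller")
    headers = {
        "Content-Type": "application/vnd.youtube.timedtext; charset=UTF-8",
        "Content-Language": language,
        # URL-escape the UTF-8 bytes of the title, including spaces.
        "Slug": quote(title, safe=""),
        "Authorization": "Bearer " + access_token,
        "GData-Version": "2",
        "X-GData-Key": "key=" + developer_key,
    }
    return Request(post_url, data=body, headers=headers, method="POST")
```

Here quote() encodes the title as UTF-8 before escaping it, so a title such as El mejor día becomes El%20mejor%20d%C3%ADa in the Slug header.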

Uploading a video transcript without caption timecodes

If YouTube's Timed Text Service (TTS) does not recognize your caption file format, YouTube will try to automatically sync the caption track to the audio track in your video. This functionality enables you to upload a video transcript without any caption timecodes at all and have YouTube generate a caption track from the transcript. The transcript does not have to be formatted in any particular way and could just be paragraphs of text.

Sample API response for uploaded caption track

If YouTube successfully handles your request, the API will return a 201 HTTP response code. The Location header in the API response will contain the self link of the caption track entry, which can be used for later retrieval of the entry. The body of the XML response will be the caption track entry.

The XML below shows a successful response to a request to create a caption track:

HTTP 201
Content-Type: application/atom+xml; charset=UTF-8; type=entry
Location: https://gdata.youtube.com/feeds/api/videos/VIDEO_ID/captions/caption_track_id

<entry xmlns='http://www.w3.org/2005/Atom'
  xmlns:app='http://www.w3.org/2007/app'
  xmlns:gd='http://schemas.google.com/g/2005'
  xmlns:xml='http://www.w3.org/XML/1998/namespace'
  gd:etag='W/"DEMCSXc7fip7ImA9WxJXEk4."'>
  <id>tag:youtube.com,2008:captions:CiULE...</id>
  <published>2009-06-05T14:14:28.906-07:00</published>
  <updated>2009-06-05T14:14:28.906-07:00</updated>
  <app:edited>2009-06-05T14:14:28.906-07:00</app:edited>
  <category scheme='http://schemas.google.com/g/2005#kind'
    term='http://gdata.youtube.com/schemas/2007#captionTrack'/>
  <title>API captions</title>
  <content type='application/vnd.youtube.timedtext'
    src='https://gdata.youtube.com/feeds/api/videos/PjLv88-zqkI/captiondata/CiULE...'
    xml:lang='en'/>
  <link rel='self' type='application/atom+xml'
    href='https://gdata.youtube.com/feeds/api/videos/PjLv88-zqkI/captions/CiULE...'/>
  <link rel='edit' type='application/atom+xml'
    href='https://gdata.youtube.com/feeds/api/videos/PjLv88-zqkI/captions/CiULE...'/>
  <link rel='edit-media' type='application/vnd.youtube.timedtext'
    href='https://gdata.youtube.com/feeds/api/videos/PjLv88-zqkI/captiondata/CiULE...'/>
</entry>

If your request is properly formatted, but YouTube cannot process the captions file you are sending, YouTube will still create an <entry> in the captions feed for the new caption track. The entry will contain an <app:control> tag, which indicates that YouTube did not successfully handle the captions file. That tag, in turn, will contain a <yt:state> tag that contains more information about the failure and an <app:draft> tag that contains a value of yes, which indicates that the track is not publicly visible.

<app:control>
  <app:draft>yes</app:draft>
  <yt:state name="failed" reasonCode="invalidFormat"/>
</app:control>
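
A client could check a returned entry for this failure marker along the following lines. This is a sketch only: the function name caption_track_failure is hypothetical, and it assumes the entry declares the app and yt namespaces with the same URIs used in the sample feed earlier in this section.

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "yt": "http://gdata.youtube.com/schemas/2007",
}


def caption_track_failure(entry_xml):
    """Return (state name, reason code) for a draft entry with a <yt:state>, else None."""
    entry = ET.fromstring(entry_xml)
    control = entry.find("app:control", NS)
    if control is None:
        return None  # no <app:control> tag: the track was handled successfully
    state = control.find("yt:state", NS)
    is_draft = control.findtext("app:draft", namespaces=NS) == "yes"
    if state is None or not is_draft:
        return None
    return (state.get("name"), state.get("reasonCode"))
```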

Retrieving a caption track

Caption tracks are served automatically to the YouTube video players, which means that anyone watching a video can opt to view any caption track for that video. However, you can still retrieve a caption track if you want to modify the caption data or the timecodes that specify when the captions display.

To retrieve a caption track, send a GET request to the src URL specified in the <content> tag of the caption track entry:

<content type="application/vnd.youtube.timedtext" xml:lang="zh-HK"
  src="https://gdata.youtube.com/feeds/api/videos/VIDEO_ID/captiondata/TRACK_ID"/>

The request below demonstrates how to retrieve a caption track:

GET /feeds/api/videos/VIDEO_ID/captiondata/TRACK_ID HTTP/1.1
Host: gdata.youtube.com
Authorization: Bearer ACCESS_TOKEN
GData-Version: 2
X-GData-Key: key=DEVELOPER_KEY

Note that the API does not return an Atom feed in response to a request for a caption track. Instead, the response contains your caption track in the same format that you uploaded it to YouTube. See the Supported formats for caption files section for details about valid caption file formats or the instructions for retrieving a caption track in an alternate format.

In addition, the following rules apply to generated caption tracks:

  • If you are retrieving a track that YouTube generated using automatic speech recognition (ASR), then you can also use the fmt parameter to specify that the track should be returned in SubViewer-compatible (sbv) or SubRip-compatible (srt) format. (In a caption tracks feed, the <yt:derived> tag indicates whether a track was generated using ASR.) If you do not specify a format when retrieving an ASR track, the API server will automatically return the track in SubViewer-compatible format.

  • If you are retrieving a track that YouTube automatically synchronized to your video, you can use the fmt parameter to specify the format in which the track should be returned. If you do not specify an fmt parameter value, then the API will return the original transcript that you uploaded. Thus, you must specify an fmt parameter value to retrieve the track's timecodes.

The API supports two other options for retrieving caption tracks:

  1. Retrieving a translation of a caption track
  2. Retrieving a caption track in an alternate format

Note: YouTube does not save translated or reformatted caption tracks that you request through the API, and those tracks will not appear in the list of caption tracks available for a video. You can upload a translated or reformatted track, however, to make it available to viewers.

Retrieving a translation of a caption track

To retrieve a translation of a caption track, append the lang parameter to the src URL specified in the <content> tag of the caption track entry. Set the parameter value to the ISO 639-1 two-letter language code that identifies the desired caption language.

The request below demonstrates how to retrieve a translated caption track:

GET /feeds/api/videos/VIDEO_ID/captiondata/TRACK_ID?lang=ISO_639_CODE HTTP/1.1
Host: gdata.youtube.com
Authorization: Bearer ACCESS_TOKEN
GData-Version: 2
X-GData-Key: key=DEVELOPER_KEY

If your request is successful, the response will contain the translated caption track in the same format that you uploaded the original track to YouTube. If a translation cannot be generated, or if the language code that you specified is not a valid value, the API will return an HTTP 400 response code.

Retrieving a caption track in an alternate format

To retrieve a caption track in a different format than the one you originally uploaded, append the fmt parameter to the src URL specified in the <content> tag of the caption track entry. Set the parameter value to one of the following values:

  • sbv – SubViewer-compatible
  • srt – SubRip-compatible

The request below demonstrates how to retrieve a caption track in an alternate format:

GET /feeds/api/videos/VIDEO_ID/captiondata/TRACK_ID?fmt=TARGET_FORMAT HTTP/1.1
Host: gdata.youtube.com
Authorization: Bearer ACCESS_TOKEN
GData-Version: 2
X-GData-Key: key=DEVELOPER_KEY

You can specify a target language and a target format by specifying values for both the lang and fmt parameters.

If YouTube cannot convert the original caption track to the requested format, the API will return an HTTP 400 response code.
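
The lang and fmt parameters can be appended to a track's src URL as in the Python sketch below. The function name caption_data_url is hypothetical. The sketch checks fmt against the two values listed above but does not validate the language code, since the set of languages the API can translate to is not specified here.

```python
from urllib.parse import urlencode

SUPPORTED_FORMATS = ("sbv", "srt")


def caption_data_url(src_url, lang=None, fmt=None):
    """Append optional lang (translation) and fmt (alternate format) parameters."""
    params = {}
    if lang is not None:
        params["lang"] = lang  # ISO 639-1 two-letter language code
    if fmt is not None:
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError("fmt must be one of: " + ", ".join(SUPPORTED_FORMATS))
        params["fmt"] = fmt
    if not params:
        return src_url
    # Respect a query string that may already be present on the src URL.
    separator = "&" if "?" in src_url else "?"
    return src_url + separator + urlencode(params)
```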

Updating a caption track

The API enables you to update the captions and timecodes for a caption track. However, you cannot update metadata about the track. So, for example, you cannot update the title or language of a caption track. If the title and/or language of a caption track is incorrect, you need to delete the track with the incorrect information and create a new track with the corrected title and language.

To update a caption track, send a PUT request to the edit-media link in the caption track entry:

<link rel="edit-media" type="application/vnd.youtube.timedtext"
  href="https://gdata.youtube.com/feeds/api/videos/VIDEO_ID/captiondata/CAPTION_ID" />

Your API request must set the Content-Type header value as described in the Creating a caption track section. In addition, your caption file must be 1MB or smaller. The request below demonstrates how to update a caption track:

PUT /feeds/api/videos/VIDEO_ID/captiondata/CAPTION_TRACK_ID HTTP/1.1
Host: gdata.youtube.com
Content-Type: application/vnd.youtube.timedtext; charset=UTF-8
Authorization: Bearer ACCESS_TOKEN
GData-Version: 2
X-GData-Key: key=DEVELOPER_KEY

<Binary Caption File Data>

The updated caption track must be in a supported format for caption files.
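
An update request might be built in Python as sketched below. The function name build_update_caption_request is hypothetical, the 1024 * 1024 byte reading of the 1MB limit is an assumption, and the request is only constructed, not sent.

```python
from urllib.request import Request

# Assumed interpretation of the "1MB or smaller" limit.
MAX_CAPTION_BYTES = 1024 * 1024


def build_update_caption_request(edit_media_url, caption_bytes,
                                 access_token, developer_key):
    """Build (but do not send) a PUT request that replaces a track's caption data."""
    if len(caption_bytes) > MAX_CAPTION_BYTES:
        raise ValueError("caption file must be 1MB or smaller")
    headers = {
        "Content-Type": "application/vnd.youtube.timedtext; charset=UTF-8",
        "Authorization": "Bearer " + access_token,
        "GData-Version": "2",
        "X-GData-Key": "key=" + developer_key,
    }
    # edit_media_url is the href of the entry's edit-media link.
    return Request(edit_media_url, data=caption_bytes, headers=headers, method="PUT")
```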

Deleting a caption track

To delete a caption track, send a DELETE request to the edit link in the caption track entry. The following sample API request demonstrates how to delete a caption track:

DELETE /feeds/api/videos/VIDEO_ID/captions/CAPTION_TRACK_ID HTTP/1.1
Host: gdata.youtube.com
Authorization: Bearer ACCESS_TOKEN
GData-Version: 2
X-GData-Key: key=DEVELOPER_KEY
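
For completeness, a Python sketch of the delete request follows. The function name build_delete_caption_request is hypothetical. Note that the request targets the entry's edit link (the captions URL), not the edit-media link used for updates.

```python
from urllib.request import Request


def build_delete_caption_request(edit_url, access_token, developer_key):
    """Build (but do not send) a DELETE request for a caption track entry."""
    headers = {
        "Authorization": "Bearer " + access_token,
        "GData-Version": "2",
        "X-GData-Key": "key=" + developer_key,
    }
    # A DELETE request carries no body, so no data argument is passed.
    return Request(edit_url, headers=headers, method="DELETE")
```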
