Captions, Transcripts, and Audio Descriptions


Accessible multimedia (synchronized visual and auditory content) must include captions: text versions of speech and other important audio content. Captions make the content accessible to people who can't hear all of the audio.

According to US government figures, one person in eight has some functional hearing limitation, and this number will increase as the average age of the population increases. Beyond people with disabilities, captioning helps people who only partially understand the language presented. Captions are also useful in noisy environments like airports, in quiet environments like libraries, and for multimodal learning.

All multimedia content with speech should have accessible captions that are:

  • Synchronized to appear at approximately the same time as the corresponding audio.
  • Equivalent to the spoken words and other audio information.
  • Accessible, or readily available, to those who need them.
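
As a rough illustration of what "synchronized" means in practice, the sketch below writes a few timed cues in WebVTT, a caption file format commonly used for web video. The `Cue` class, the helper names, and the sample cue text are hypothetical and exist only for this example; a real workflow would normally rely on a captioning tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue (hypothetical structure for this example)."""
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str     # spoken words, or a bracketed description of a sound


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Serialize cues as WebVTT text: a header, then one block per cue."""
    blocks = ["WEBVTT"]
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text}")
    # Blocks are separated by a blank line, as the format requires.
    return "\n\n".join(blocks) + "\n"


# Made-up sample cues: one non-speech sound and one line of speech.
example_cues = [
    Cue(0.0, 2.5, "[upbeat music]"),
    Cue(2.5, 6.0, "Welcome back to the show."),
]
print(to_webvtt(example_cues))
```

Each cue carries a start and end time, which is what lets a player show the text at approximately the same time as the audio. The bracketed music cue illustrates the "equivalent" requirement: important non-speech audio is captioned along with the spoken words.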

Captions as typically seen on television
Screenshot of The Tonight Show Starring Jimmy Fallon television broadcast. Captions display on the image.

The most common type of caption is the closed caption, which the viewer can turn on or off. Most countries now require most pre-recorded and live television programs (such as news and sports events) to be closed-captioned, and these captions can be easily enabled and viewed on screen.

Captions as seen on DVD or Blu-ray
Screenshot from movie Avatar. Captions appear on screen - You crossed the line.

On broadcast television, the style and location of the captions depend on the caption decoder built into the viewer's television receiver or streaming device. In online or streaming video, the browser or video player determines how captions will be displayed. Many decoders and video players allow the user to customize caption size, color, font, and location on the screen.

Captions as seen in a web media player
Screenshot of captions in a web media player

Open captions include the same content as closed captions, but the captions are a permanent part of the video picture and cannot be turned off. The captions are visible to anybody viewing the video clip. This gives the media producer total control (and the user no control) over the way the captions appear, including caption location, size, color, font, and timing.


Also see our article on real-time captions for information on captioning live web multimedia and broadcasts.


For multimedia, a transcript can also help users who can neither hear the audio nor see the video. Beyond the spoken words, a transcript should include descriptions of important audio information (like laughter) and visual information (such as someone entering the room). Transcripts help deaf-blind users interact with content using refreshable Braille devices.

Transcripts also allow anyone who cannot access web audio, video, or both to read the content as text instead. For most web video, both captions and a text transcript should be provided. For audio-only content such as a podcast, a transcript will usually suffice, and captions are not necessary.

Transcripts make multimedia content searchable by search engines and users. Screen reader users may also prefer a transcript over real-time audio, since most proficient screen reader users set their assistive technology to read at a rate much faster than natural human speech.
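
Because captions and a transcript share most of their text, a caption file can serve as a starting point for a transcript. The sketch below assumes the captions are in a simple WebVTT file and pulls out the cue text as plain lines; the function name and sample data are hypothetical. The output is only a draft, since a full transcript should also describe important visual information that captions do not contain.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:02.500 --> 00:00:06.000".
# The hours field is optional in the format.
TIMING_LINE = re.compile(
    r"^\s*(?:\d+:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d+:)?\d{2}:\d{2}\.\d{3}"
)


def webvtt_to_transcript(vtt_text: str) -> str:
    """Return the cue text of a simple WebVTT file, one cue per line."""
    transcript_lines = []
    # Blocks in a WebVTT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        block_lines = block.splitlines()
        # Blocks without a timing line (the header, notes) are skipped.
        for i, line in enumerate(block_lines):
            if TIMING_LINE.match(line):
                # Everything after the timing line is caption text.
                parts = [p.strip() for p in block_lines[i + 1:] if p.strip()]
                if parts:
                    transcript_lines.append(" ".join(parts))
                break
    return "\n".join(transcript_lines)


# Made-up sample caption file for illustration.
sample_vtt = """WEBVTT

1
00:00:00.000 --> 00:00:02.500
[door opens]

2
00:00:02.500 --> 00:00:06.000
Welcome back
to the show.
"""
print(webvtt_to_transcript(sample_vtt))
```

The resulting text can be published alongside the video, where search engines can index it and screen reader users can read it at their own pace. Speaker names and descriptions of on-screen action would still need to be added by hand.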


In order to be optimally accessible to users with auditory disabilities, web multimedia should include both synchronized captions and a transcript.