Make video and audio accessible

Videos, audio files, and any other synchronized or time-based media need alternatives to make them accessible to more people. Visual content needs to be described for people who are blind or have limited vision. Auditory content needs to be described for people who are deaf or hard of hearing.

Transcripts, captions, and audio descriptions allow more people to access your media. Considering these alternatives during the planning, scripting, and creation process will help you create more accessible media from the start. Keep the following suggestions in mind:

  • Limit background noise to help people hear the speech or main audio.
  • Avoid flashing and blinking lights, which can cause seizures.
  • Present the media on an accessible player.

For in-depth information, see the Web Accessibility Initiative (WAI) resource on media alternatives and other considerations.

The following will help you choose which alternatives you need for your media.

Transcripts

You must provide a transcript or other text alternative for audio-only media. While not required for video with audio, a descriptive transcript is strongly recommended so that as many people as possible can access your media. Transcripts let everyone access a text version of the media, including people who are deaf or blind. Search engines can also index HTML transcripts, which improves search engine optimization (SEO).

A simple transcript includes the speech and non-speech audio information. A more thorough, descriptive transcript also includes visual information. You can use the closed captions and audio description script as a starting point for a transcript. Transcripts can be provided on a separate, linked page, integrated on the same page as the media, or interactive. Although you can use PDF or other document types for transcripts, HTML is the preferred format.
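
As a rough illustration of starting a transcript from existing caption and audio description material, the following Python sketch merges the two into one time-ordered draft. The `Entry` type, the `draft_transcript` function, and the `[Visual: ...]` labeling are hypothetical choices made for this example, not part of any standard or tool; a real transcript would still need editing by a person.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    start: float  # seconds from the start of the media
    text: str
    kind: str     # "caption" (speech or non-speech audio) or "description" (visual)


def draft_transcript(captions, descriptions):
    """Merge caption and description entries into one time-ordered draft."""
    merged = sorted(captions + descriptions, key=lambda e: e.start)
    lines = []
    for entry in merged:
        if entry.kind == "description":
            # Mark visual information so readers can tell it apart from audio.
            lines.append(f"[Visual: {entry.text}]")
        else:
            lines.append(entry.text)
    return "\n".join(lines)


captions = [
    Entry(0.0, "[Upbeat music]", "caption"),
    Entry(2.5, "Welcome to the lab.", "caption"),
]
descriptions = [Entry(1.0, "A researcher opens a door.", "description")]
print(draft_transcript(captions, descriptions))
```

The draft keeps speech, non-speech audio, and visual information in the order they occur, which is the sequencing a descriptive transcript needs.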

Captions

You must provide captions for all live video and most prerecorded video. Examples of caption types include closed captions, open captions, and subtitles.

Captions should be synchronized and accurate, presenting the same content available in the audio track. The captions should not cover important images, including other text in the video, and should have enough color contrast to be visible.
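
Synchronization comes down to giving each caption a start and end time. The sketch below, in Python, writes cues in WebVTT, one common caption file format; this document does not name a format, so WebVTT is an assumption here, and the function names `vtt_timestamp` and `build_vtt` are hypothetical.

```python
def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(cues) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    blocks = ["WEBVTT"]
    for start, end, text in cues:
        if end <= start:
            raise ValueError("a cue must end after it starts")
        blocks.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}\n{text}")
    # Cues are separated from the header and from each other by a blank line.
    return "\n\n".join(blocks) + "\n"


print(build_vtt([
    (0.0, 2.5, "[Door creaks]"),
    (2.5, 5.0, "Welcome to the lab."),
]))
```

Note that the example cues include non-speech audio ("[Door creaks]") as well as speech, matching the requirement that captions present the same content as the audio track.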

Captions allow people who are deaf or hard of hearing access to the audio information. People in quiet or noisy environments, non-native speakers of the language, and people listening to an unfamiliar accent also benefit from closed captions.

Audio descriptions

You must provide audio description for your videos if the visual information is not included in the audio. Types of descriptions include live audio description and extended audio description.

The audio description should be inserted in natural pauses in the narration or dialog near the time the referenced action happens. Extended audio description, requiring the video and audio tracks to be programmatically paused to insert the description, may be used when the audio description doesn't fit into natural pauses in the narration or dialog. A carefully planned video can be designed to include natural pauses where the audio descriptions can be inserted with few interruptions.
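
To illustrate the choice between standard and extended audio description, the Python sketch below looks for a natural pause that is both long enough for a description and close to the action it describes. The function name `find_pause` and the `max_offset` tolerance of 5 seconds are hypothetical values chosen for this example; how near is "near" is an editorial judgment, not a fixed rule.

```python
def find_pause(pauses, action_time, description_seconds, max_offset=5.0):
    """Return the pause closest to action_time that fits the description.

    pauses: list of (start_seconds, end_seconds) gaps in narration or dialog.
    Returns None when no nearby pause is long enough, which suggests that
    extended audio description may be needed for this description.
    """
    candidates = [
        (start, end)
        for start, end in pauses
        if end - start >= description_seconds
        and abs(start - action_time) <= max_offset
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: abs(p[0] - action_time))


pauses = [(2.0, 3.0), (10.0, 14.0), (30.0, 36.0)]
print(find_pause(pauses, action_time=11.0, description_seconds=3.0))
print(find_pause(pauses, action_time=11.0, description_seconds=5.0))
```

In this example, a 3-second description of an action at 11 seconds fits the pause from 10 to 14 seconds, while a 5-second description finds no suitable nearby pause.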

Audio descriptions help people who are blind or visually impaired. Audio descriptions should be a separate, selectable track if possible. If multiple audio tracks are not possible, then the audio description version of the video can be linked.

Decision tree

This decision tree will help you determine which alternatives you will need for your live or prerecorded media.

Graphic version of the decision tree

[Figure: Flow diagram of the decision tree for live and prerecorded media]

Text version of the decision tree

What does your time-based media include?

  1. Live video with synchronized audio
    1. You must provide live-generated closed captions of the speech and non-speech audio; see WCAG 2.1 SC 1.2.4: Captions (Live). Live audio description is strongly recommended if key visual content will otherwise not be verbalized.
  2. Prerecorded audio only
    1. If there is already text on the screen that accurately reflects the audio, no further steps are needed.
    2. If not, you must provide a transcript of the speech and non-speech audio; see WCAG 2.1 SC 1.2.1: Audio-only and Video-only (Prerecorded).
  3. Prerecorded video only
    1. You must provide synchronized audio descriptions or a text alternative that presents the information in the video; see WCAG 2.1 SC 1.2.1: Audio-only and Video-only (Prerecorded).
  4. Prerecorded video with synchronized audio
    1. Is there text on screen that accurately reflects both the speech and non-speech audio?
      1. If yes, captions are not needed. Continue to question 2 to check whether audio description is needed.
      2. If no, you must provide accurate synchronized closed captions of speech and non-speech audio; see WCAG 2.1 SC 1.2.2: Captions (Prerecorded). Continue to question 2 to check whether audio description is needed.
    2. Does the audio already describe the important visual elements? For example, text on screen is spoken as it appears.
      1. If yes, audio description is not needed. A transcript is strongly recommended but not required; see WCAG 2.1 SC 1.2.8: Media Alternative (Prerecorded).
      2. If no, you must provide synchronized audio description; see WCAG 2.1 SC 1.2.5: Audio Description (Prerecorded). If there is not enough time during pauses in dialog or narration for complete descriptions, you may use extended audio descriptions; see WCAG 2.1 SC 1.2.7: Extended Audio Description (Prerecorded). A transcript is strongly recommended but not required; see WCAG 2.1 SC 1.2.8: Media Alternative (Prerecorded).
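
The text version of the decision tree can also be sketched as a small Python function. This is an illustrative encoding only: the `Media` enum, the `required_alternatives` function, and its parameter names are hypothetical, and the returned strings simply paraphrase the steps above.

```python
from enum import Enum


class Media(Enum):
    LIVE_VIDEO_WITH_AUDIO = 1
    PRERECORDED_AUDIO_ONLY = 2
    PRERECORDED_VIDEO_ONLY = 3
    PRERECORDED_VIDEO_WITH_AUDIO = 4


def required_alternatives(media, text_reflects_audio=False,
                          audio_describes_visuals=False):
    """Return (required, recommended) lists that follow the decision tree.

    text_reflects_audio: on-screen text already matches the speech and
        non-speech audio.
    audio_describes_visuals: the audio already conveys the important
        visual content.
    """
    required, recommended = [], []
    if media is Media.LIVE_VIDEO_WITH_AUDIO:
        required.append("live closed captions (SC 1.2.4)")
        if not audio_describes_visuals:
            recommended.append("live audio description")
    elif media is Media.PRERECORDED_AUDIO_ONLY:
        if not text_reflects_audio:
            required.append("transcript (SC 1.2.1)")
    elif media is Media.PRERECORDED_VIDEO_ONLY:
        required.append("audio description or text alternative (SC 1.2.1)")
    elif media is Media.PRERECORDED_VIDEO_WITH_AUDIO:
        if not text_reflects_audio:
            required.append("closed captions (SC 1.2.2)")
        if not audio_describes_visuals:
            required.append("audio description (SC 1.2.5)")
        # A transcript is recommended on both branches of the last question.
        recommended.append("transcript (SC 1.2.8)")
    return required, recommended


required, recommended = required_alternatives(Media.PRERECORDED_VIDEO_WITH_AUDIO)
print("Required:", required)
print("Recommended:", recommended)
```

The sketch leaves out extended audio description (SC 1.2.7), which depends on whether the descriptions fit into the available pauses rather than on the type of media.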

Common use cases

Your multimedia files must have a visual equivalent for any audio and an audio equivalent for any visual content. But applying that rule can be confusing. Some videos may not have audio, or the video may be slides of spoken text. And what about a "talking head" video where a single person at a podium gives a speech?

A good way to think about these situations is to consider what parts of your multimedia someone may or may not have access to.

Live video

Live video, including video conferences and webinars, should have closed captions generated on the screen in real time. Live audio description is strongly recommended if the video includes visual content, such as slides, that will not be described in the existing audio.

Prerecorded audio

Whether your audio file is a piece of music, a podcast, or any other audio-only media, a transcript or other text alternative is necessary.

Examples include:

  • A web page includes an audio recording of a song with lyrics. The lyrics appear below the audio file.
  • A web page for a podcast episode includes a transcript on the page or through a link.
  • A web page includes an audio recording of instrumental music along with a text description of the recording, for example, "String quartet playing Johann Pachelbel's Canon in D".

Prerecorded video

Videos commonly have both visual and audio information. Think of a movie or television show, for example. These types of videos need closed captions and audio descriptions. A descriptive transcript is strongly recommended. But not all videos are this straightforward.

Examples of other types of videos include:

  • A "talking head" video has closed captions for speech and non-speech audio. A description, such as "Jane Smith's speech to the freshman class", is clearly indicated in the surrounding text on the page. The page links to a transcript of the speech.
  • A video of a song has slides of the lyrics, highlighted as they are sung. When the video starts, a single caption indicates that music is playing. An audio description track is not included since all the visual information is in the audio track. The page links to a transcript of the lyrics.
  • A scientific animation with no sound is embedded on a web page. The page provides a link to an audio-described version. The surrounding text on the page describes the actions in the animation.
  • A video with no audio content includes a caption in the video or surrounding text on the page indicating that the video has no audio content. The page provides a link to an audio-described version.
  • A video with a soundtrack that has a specific purpose (for example, setting a mood or accompanying a music video) indicates the purpose of the soundtrack in the closed captions, possibly including the song title, artist, and lyrics (if any). Audio description is also provided.

Glossary

  • Audio description: An audio narrative track that fits between the dialog, describing actions, locations, facial expressions, body language, and on-screen text, enabling viewers to keep up with visual elements on screen.

  • Closed captions: Synchronized visual or text alternative for both speech and non-speech audio information that can be turned on or off. Closed captions exist as a separate text stream and are available to various assistive technologies.

  • Extended audio description: Audio description that doesn't fit into natural pauses in the narration or dialog, requiring the video and audio tracks to be programmatically paused to insert the description.

  • Live audio description: Audio description produced and transmitted as a live event happens.

  • Open captions: Captions that are always in view and cannot be turned off because they are part of the video itself.

  • Subtitles: Synchronized text translation of the speech content into a different language.

  • Synchronized media: Audio or video synchronized with another format for presenting information or with time-based interactive components. Examples include a video with a synchronized audio track or a slideshow with a synchronized prerecorded soundtrack.

  • Text alternative: A transcript or other correctly sequenced document such as a screenplay that accurately represents the video or audio content.

  • Time-based media: Live or prerecorded media that has time as a component, such as an audio recording, a video, or audio or video combined with interaction.

  • Transcript: A document including correctly sequenced text descriptions of time-based visual and auditory information.

This is document avhj in the Knowledge Base.
Last modified on 2023-10-04 10:58:46.