CaptionCaster

Note

CaptionCaster is not currently under development. WebAIM hopes to resume development at some time in the future, but at this point we do not have any plans or timeline for a release of CaptionCaster.

What is CaptionCaster?

CaptionCaster is a real-time captioning system. It is a software program that streams text captions across the internet for any live broadcast. Any end user with an internet connection can receive the text version of the broadcast audio in real time. It can be used with web casts, web conferencing, audio/telephone conferences, television and radio broadcasts, in-person lectures or presentations, and any other medium in which live audio is used.

Why CaptionCaster?

CaptionCaster allows audio content to be accessible to those who are deaf or hard of hearing and increases the accessibility and usability of audio broadcasts for all users. Benefits include:

  • Enables those who are deaf or hard of hearing to participate, thus meeting ethical and legal obligations.
  • Increases broadcast participation. Participants no longer need to listen to the audio; they can read the text instead. This is useful when hardware does not support audio, when the environment is not conducive to audio (such as a public library, a lab, or a noisy location), or when participants want a simpler way to get the broadcast content.
  • Increases understandability and retention through reading the content.

What Does CaptionCaster Do?

There are two primary difficulties when dealing with real-time captioning. The first is generating the captions in real-time (as or shortly after the words are spoken). The second is delivering the captions in real-time to the end user. CaptionCaster solves both issues.

Generating real-time captions

CaptionCaster uses stenograph input to convert the spoken word to text. A stenograph machine is a special typewriter-like device that trained operators use to 'type' phonetically at a very fast rate. CaptionCaster receives this input from a stenographer and formats it for internet broadcast use.

CaptionCaster can also use external voice recognition software for input. If the voice recognition software is well trained and the audio is captured in a suitable environment, the resulting text will be of adequate quality for text broadcasting.

Delivering real-time captions

CaptionCaster processes the stenographer or voice recognition input, converts it into a format suitable for broadcast on the internet, then streams this text to broadcast participants. Participants can view the live text from nearly any internet-connected computer or handheld device, and even from some cell phones.
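
The delivery step is essentially a fan-out: one stream of caption text is copied to every connected participant. The following is a minimal sketch of that idea in Python, not CaptionCaster's actual implementation. The class name CaptionBroadcaster, its methods, and the use of in-memory queues in place of network connections are all assumptions made for illustration.

```python
import asyncio


class CaptionBroadcaster:
    """Hypothetical fan-out hub: one caption source, many subscribers."""

    def __init__(self):
        self._subscribers = set()

    def subscribe(self):
        """Register a new participant and return the queue it reads from."""
        queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def publish(self, text):
        """Copy one caption fragment to every subscriber."""
        for queue in self._subscribers:
            queue.put_nowait(text)

    def close(self):
        """Signal the end of the broadcast with a None sentinel."""
        for queue in self._subscribers:
            queue.put_nowait(None)


async def read_all(queue):
    """Stand-in for a client: collect fragments until the broadcast ends."""
    received = []
    while (item := await queue.get()) is not None:
        received.append(item)
    return received


async def demo():
    hub = CaptionBroadcaster()
    queues = [hub.subscribe() for _ in range(3)]
    readers = [asyncio.create_task(read_all(q)) for q in queues]
    for fragment in ["Welcome ", "to the ", "broadcast."]:
        hub.publish(fragment)
    hub.close()
    return await asyncio.gather(*readers)


if __name__ == "__main__":
    for number, fragments in enumerate(asyncio.run(demo()), start=1):
        print(f"client {number}: {''.join(fragments)}")
```

In a real deployment each queue would presumably be replaced by a network connection, and a slow or disconnected client would need its own handling so that it does not hold up the others.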

How Does CaptionCaster Work?

CaptionCaster works entirely in parallel with your broadcast media. This allows it to work with any type of audio delivery mechanism, from in-person lectures and presentations to web casts, radio, and television broadcasts.

CaptionCaster has four primary components:

  1. CaptionCaster - Obtains, decodes, and broadcasts streaming captions to clients.
  2. CaptionClient - The primary client through which the end user receives captions.
  3. CaptionConnect - A post-broadcast tool used for correcting, synchronizing, and archiving captions.
  4. CaptionConvert - A tool for converting caption files between various caption formats.

Note

All screen shots and descriptions below are from very early alpha versions of the software. The interfaces and feature sets are not complete and will certainly change before a beta release.

CaptionCaster

The core component of the real-time captioning system.

The CaptionCaster interface showing input on the left, output on the right, and status on the bottom

  • Receives input from:
    • Stenographer (via local hardware connection, telephone line, or internet socket)
    • Keyboard
    • Voice recognition
    • Data file
    • Another CaptionCaster system
  • Decodes common stenographer formats and codes into a useful format
    • Allows custom formatting
    • Allows a custom captioning delay so that the streaming text stays synchronized with the encoded audio
  • Generates output to:
    • CaptionClient - the primary client system (see below). Over 1000 simultaneous CaptionClient connections are possible.
    • Standard web page format accessible by any internet connected computer (including assistive technologies which allow live web casts to be accessible to those who are blind or deaf-blind)
    • Instant messenger networks (including cell phones and text message services)
    • Mobile/handheld devices
    • Media players (Windows Media and RealPlayer)
    • Other CaptionCaster systems (for load balancing or distributed networking)
  • Stores text archives
    • CaptionConnect formatted, time-encoded log (see below)
    • Plain text transcript
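
The captioning delay listed above can be pictured as a small buffer that holds each caption until a fixed amount of time has passed. The sketch below is one possible way to do this, assuming the encoded audio reaches listeners a known number of seconds after the words are spoken. DelayedCaptionBuffer and its methods are hypothetical names, not part of CaptionCaster, and the sketch assumes captions arrive in time order.

```python
from collections import deque


class DelayedCaptionBuffer:
    """Hypothetical helper that holds captions back by a fixed delay."""

    def __init__(self, delay_seconds):
        if delay_seconds < 0:
            raise ValueError("delay_seconds must be non-negative")
        self.delay_seconds = delay_seconds
        self._pending = deque()  # (release_time, text) in arrival order

    def push(self, text, received_at):
        """Queue a caption; it becomes visible delay_seconds later."""
        self._pending.append((received_at + self.delay_seconds, text))

    def pop_ready(self, now):
        """Return every caption whose release time has been reached."""
        ready = []
        while self._pending and self._pending[0][0] <= now:
            ready.append(self._pending.popleft()[1])
        return ready


buffer = DelayedCaptionBuffer(delay_seconds=5.0)
buffer.push("Good morning.", received_at=0.0)
buffer.push("Let's begin.", received_at=2.0)
print(buffer.pop_ready(now=4.0))  # [] (nothing is due yet)
print(buffer.pop_ready(now=5.5))  # ['Good morning.']
print(buffer.pop_ready(now=7.0))  # ["Let's begin."]
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the example deterministic and easy to test.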

CaptionClient

CaptionClient is the primary way in which users will view the real-time captions. It uses Adobe Flash technology, which is installed on 95% of internet-enabled computers. The display can be customized for the viewing environment and includes accessibility controls for increased font size and customized colors and contrast.

CaptionConnect

CaptionConnect allows a very quick turnaround for generating accessible, captioned versions of your archived broadcast. At the conclusion of a CaptionCaster-enabled live broadcast, CaptionCaster will save a time-encoded log of the entire session in XML format. CaptionConnect allows you to fix any input mistakes, misspellings, timing inconsistencies, or synchronization issues. It also allows editing, combining, or splitting of individual caption displays (a caption display is the section of text that appears on screen at any given time). You can then easily synchronize the captions with an archived version of your broadcast media. CaptionConnect will then output the complete, time-encoded, synchronized captions for use in the following media formats:

  • DVD
  • RealText (for use with RealMedia content)
  • QTText (Quicktime)
  • SAMI (Windows Media Player)
  • SubViewer (Google Video)
  • TimedText (Adobe Flash)
  • Plain text transcript

In just a matter of minutes, you can generate and post accessible, captioned versions of your broadcast media. CaptionConnect can also open other common caption formats to allow editing, clean-up, and synchronization of most caption formats and conversion from one caption/media format to another.
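
To make the clean-up workflow more concrete, the sketch below loads a time-encoded XML log, combines two caption displays, shifts the timing to match archived media, and produces a plain text transcript. The XML layout shown here (a captions root with caption elements carrying start and end attributes in seconds) is purely an assumption for illustration, since the real log schema is not described. The function names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, replace

# Hypothetical log layout; the real CaptionCaster schema is not documented here.
SAMPLE_LOG = """<captions>
  <caption start="0.0" end="2.5">Welcome to the broadcast.</caption>
  <caption start="2.5" end="4.0">Today we cover</caption>
  <caption start="4.0" end="6.0">real-time captioning.</caption>
</captions>"""


@dataclass(frozen=True)
class Caption:
    start: float  # seconds from the start of the session
    end: float
    text: str


def load_log(xml_text):
    """Parse the assumed XML layout into a list of Caption records."""
    root = ET.fromstring(xml_text)
    return [
        Caption(float(el.get("start")), float(el.get("end")), (el.text or "").strip())
        for el in root.iter("caption")
    ]


def combine(captions, index):
    """Merge the caption display at index with the one that follows it."""
    first, second = captions[index], captions[index + 1]
    merged = Caption(first.start, second.end, f"{first.text} {second.text}")
    return captions[:index] + [merged] + captions[index + 2:]


def shift(captions, offset):
    """Move every caption by offset seconds, never going below zero."""
    return [
        replace(c, start=max(0.0, c.start + offset), end=max(0.0, c.end + offset))
        for c in captions
    ]


def transcript(captions):
    """Join the caption text into a plain text transcript."""
    return " ".join(c.text for c in captions)


captions = shift(combine(load_log(SAMPLE_LOG), 1), 1.5)
for c in captions:
    print(f"{c.start:>5.1f} - {c.end:>5.1f}  {c.text}")
print(transcript(captions))
```

Each operation returns a new list instead of modifying the old one, which would make an undo feature in an editor straightforward to add.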

CaptionConvert

CaptionConvert allows instant and easy conversion between many caption formats. It is useful if you are moving content from one media type to another and do not want to re-caption or resynchronize captions with the new media format. It retains basic formatting information between formats and can be configured to automatically convert an entire batch of caption files from one format to another. Unlike CaptionConnect, CaptionConvert does not allow captions to be edited. However, you can use CaptionConvert to convert existing captions into a format compatible with CaptionConnect and then fully edit the captions and timing information.
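
Format conversion of this kind usually means reading captions into one neutral representation and writing them back out through a format-specific writer. The sketch below illustrates that pattern with two writers: a deliberately simplified SAMI document and a plain text transcript. It is not CaptionConvert's code. The cue representation (start seconds, end seconds, text), the function names, and the ENUSCC class name are assumptions, and real SAMI files typically carry more styling and metadata than shown.

```python
from html import escape


def to_sami(cues, css_class="ENUSCC"):
    """Render (start_s, end_s, text) cues as a simplified SAMI document."""
    lines = [
        "<SAMI>",
        "<HEAD>",
        '<STYLE TYPE="text/css"><!--',
        f".{css_class} {{ Name: English; lang: en-US; }}",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for start, end, text in cues:
        # SAMI times are expressed in milliseconds.
        lines.append(
            f"<SYNC Start={round(start * 1000)}>"
            f"<P Class={css_class}>{escape(text)}</P></SYNC>"
        )
        # A non-breaking space at the end time clears the caption.
        lines.append(
            f"<SYNC Start={round(end * 1000)}>"
            f"<P Class={css_class}>&nbsp;</P></SYNC>"
        )
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines)


def to_plain_text(cues):
    """Render cues as a transcript with one caption per line."""
    return "\n".join(text for _start, _end, text in cues)


# Hypothetical registry mapping a format name to its writer.
WRITERS = {"sami": to_sami, "txt": to_plain_text}


def convert(cues, target_format):
    """Write cues in the requested format, or fail for unknown formats."""
    try:
        writer = WRITERS[target_format]
    except KeyError:
        raise ValueError(f"unsupported format: {target_format}") from None
    return writer(cues)


cues = [(0.0, 2.5, "Welcome to the Q & A."), (2.5, 5.0, "First question?")]
print(convert(cues, "sami"))
print(convert(cues, "txt"))
```

Batch conversion would then amount to calling convert once per input file, and supporting another output format would mean adding one more writer to the registry.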

Putting it all together

The four components of the CaptionCaster system provide a complete solution for internet-based real-time captioning needs. The following schematic shows how the components work together.

CaptionCaster schematic - description is provided below

CaptionCaster receives stenographer, keyboard, or voice recognition input, which is instantly transformed into formatted data. The broadcaster then transmits this text in real time to clients, including CaptionClient, mobile devices and cell phones, and popular media players and instant messaging programs. After a broadcast, CaptionConnect allows you to clean up the captions and generate a transcript and time-encoded archives for media players, DVD, Google Video, Flash, etc. CaptionConvert allows conversion between many of these caption formats.

WebAIM is an initiative of:
Center for Persons with Disabilities (CPD) Utah State University