“Meet The Experts” on Captioning with Kevin Erler and Pat Brogan, Automatic Sync Technologies, June 25, 2009, 1 PM – 2 PM PDT

1 The Accessible Technology Initiative (ATI) Presents
“Meet The Experts” on Captioning with Kevin Erler and Pat Brogan, Automatic Sync Technologies
June 25, 2009, 1 PM – 2 PM PDT
Please use a headset or your computer speaker.
Chat will be turned off until the end of the presentation.
Captions are provided by Bill Courtland.
This presentation is being recorded, and you will receive an announcement of the archived location by July 1.
If you have any questions or concerns, please email Jean Wells, jwells@calstate.edu.

2 About the Presenters
Pat Brogan
• 30+ years professional engagement
• Dissertation and research on rich media & distance learning
• Worked with standards on e-learning (SCORM, LOM, Accessibility)
• Adjunct marketing professor, SCU
• Highly influenced by motherhood, non-profit work with at-risk youth
Kevin Erler
• Engineer by training
• 20+ years background in speech processing
• Focus on automation and cost reduction, without compromising quality
• Founder of Automatic Sync Technologies
© 2009 Automatic Sync Technologies

3 Agenda
1. Introduction to captioning
2. The captioning process
3. Assessing quality
4. Approaches to large scale captioning
5. Motivations for captioning
6. The media explosion
7. Choosing what to caption and transcribe
8. Funding issues

4 A History of Captioning
• Captioning is a synchronized text representation of the audio component of a program. It is sometimes called subtitling. In traditional broadcast media, it is sometimes called “Line 21”.
• The captioning industry was created in the early 1980s by an FCC mandate for broadcast TV.
• The FCC mandate had a *slow* phase-in period that reached 100% only in January 2006.
• Today captioning can be applied to many different types of media, and many other regulations govern these different forms of media.

5 Captioning in Education
• Media is everywhere. The use of video in education is substantial and increasing.
• Captioning is starting to proliferate: many educational institutions are finally seeing the many benefits of captioning and are increasingly taking a proactive approach.
• The value and impact extend beyond just making content accessible for the deaf. Most users are not deaf or hard of hearing, but use the words to search, to reinforce language skills, and to comprehend better.
• Look at what we found: here are some of the dozens of examples we found of captioning in the educational environment.

9 Creative Uses of Transcripts & Captions
Students retain more if they are able to 'read ahead' and have more of the transcript visible.

10 Captioned Recorded Lecture

12 Captioned iPhone Apps

13 Transcripts

14 Some Captioning Terms
A key element of captioning is that it is a synchronized text representation of the audio component of a program.
o Transcription vs. Captioning
o Subtitling vs. Captioning
o Open vs. Closed captioning
o Post vs. Real-time
o Web vs. Broadcast

15 The Transcription-Captioning Process
1. Produce a transcript of the audio portion of the program. You need to observe certain conventions to represent non-dialog content.
2. Divide the text into captions, observing guidelines about where to break sentences. Some constraints here depend on the media type and the video size.
3. Synchronize the captions to the video timeline.
4. Create output files in the format required by your media. Note that the format is dictated by the type of player that the content will be played on, not by the media itself.
5. Encode the caption data into the final media. This process varies widely depending on the media/player type.
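The dividing, synchronizing, and output-file steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' tooling: the SubRip (.srt) layout is picked as one example of a player-specific output format, the 32-character row limit is borrowed from Line 21 broadcast captions, the fixed three-second duration is a stand-in for real synchronization against the audio, and all function names are hypothetical.

```python
import textwrap


def split_into_captions(transcript, max_chars=32, max_rows=2):
    """Break transcript text into captions of at most max_rows rows of max_chars each."""
    rows = textwrap.wrap(transcript, width=max_chars)
    return ["\n".join(rows[i:i + max_rows]) for i in range(0, len(rows), max_rows)]


def srt_timestamp(seconds):
    """Format a time in seconds as the HH:MM:SS,mmm timestamp used in SubRip files."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(captions, seconds_per_caption=3.0):
    """Write captions as SubRip text.

    Assumption: every caption gets the same fixed duration. A real workflow
    would take start and end times from synchronization with the audio.
    """
    blocks = []
    for index, text in enumerate(captions, start=1):
        start = (index - 1) * seconds_per_caption
        end = index * seconds_per_caption
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = ("Captioning is a synchronized text representation "
              "of the audio component of a program.")
    print(to_srt(split_into_captions(sample)))
```

The row length and rows-per-caption limits are parameters because, as the slide notes, those constraints depend on the media type and the video size.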

16 Transcription and Captioning Process
• NCI estimates these steps will take 10 to 16 labor hours per media hour using traditional captioning methods.
• How long it takes will depend on how much attention you pay to quality issues.
• You should assume it will not take less than 5 to 7 labor hours per hour of media.

17 Encoding?
• For DVD media: the captions need to be written back to your DVD; this is done with an authoring package.
• For tape media (e.g., VHS): the captions need to be written out to your tape using a caption encoder. This can also be done using a software NLE system (in some cases).
• For traditional web media: the caption files are typically external and read by the player, so no encoding is needed. Some formats allow you to embed the caption data (e.g., QuickTime, Windows Media).
• For content portals: typically, the caption file must be uploaded separately from the video file.
• For mobile devices: high variability. For iPods (and iTunes), you must embed the caption file using Apple tools before upload.

18 Captioning Tools
• Professional Tools
  o CPC / MacCaption
• Do It Yourself
  o Magpie
  o SubtitleWorkshop
• Speech Recognition
  o CaptionMic
• Content Portals
  o CaptionTube
  o Overstream
• Services
  o NCI
  o Vitac
  o CaptionFirst (Realtime)
  o Automatic Sync

19 Setting the Context: Quality
• Quality is a continuum; it trades off against cost and time.
• “100%” is unlikely: the cost is too high. But what is acceptable?
• Your mission statement almost certainly includes “excellence”.
• Responding to OCR (Office for Civil Rights) complaints: can you demonstrate due attention to serving the needs?

20 © 2009 Automatic Sync Technologies 20 Word Error Rate 0% Error Rate Everyone loves a booming market, and most booms happen on the back of technological change. The world's venture capitalists, having fed on the computing boom of the 1980s, the internet boom of the 1990s and the biotech and nanotech boomlets of the early 2000s, are now looking around for the next one. They think they have found it: energy. Many past booms have been energy-fed: coal-fired steam power, oil-fired internal-combustion engines, the rise of electricity, even the mass tourism of the jet era. But the past few decades have been quiet on that front. Coal has been cheap. Natural gas has been cheap. The 1970s aside, oil has been cheap. The one real novelty, nuclear power, went spectacularly off the rails. The pressure to innovate has been minimal. In the space of a couple of years, all that has changed. Oil is no longer cheap; indeed, it has never been more expensive. Moreover, there is growing concern that the supply of oil may soon peak as consumption continues to grow, known supplies run out and new reserves become harder to find. The idea of growing what you put in the tank of your car, rather than sucking it out of a hole in the ground, no longer looks like economic madness. Nor does the idea of throwing away the tank and plugging your car into an electric socket instead.

21 © 2009 Automatic Sync Technologies 21 Word Error Rate 10% Error Rate Boot hoses a booming market, gloved capote booms happen heart the back of technological change. The world's venture capitalists, house fed gem's the computing boom of the 1980s, the internet boom of the 1990s and the biotech and nanotech boomlets of the early 2000s, are now looking around for the road one. They gaunt they have found bubonic: energy. Many past booms have been energy-fed: coal-fired steam power, oil-fired internal-combustion engines, the rise of electricity, even the brushy tourism of the jet era. But the past few decades have been quiet on magic front. Coal has been cheap. Natural gas gross hoist cheap. Jennifer 1970s aside, oil has been cheap. The one real novelty, nuclear power, went spectacularly off tabloid rails. The burping to innovate has been minimal. In local space of a couple of years, all that has paycheck. Oil is no longer cheap; indeed, it has never been more expensive. Moreover, there is fizzled translogic that the supply of oil may soon peak as consumption rains to grow, known supplies run out and new reserves become zipper to find. The idea of growing what you put in the tank of your car, rather saber sucking it out of a hole in grim ground, no longer looks like economic madness.

22 © 2009 Automatic Sync Technologies 22 Word Error Rate 20% Error Rate Kazakhstan banter a booming estate, and most systemically happen on the back of technological bleed. The world's venture capitalists, Italians fed on seltzer computing boom kingdom the 1980s, the internet levy of paddy 1990s and the harder and nanotech boomlets of the early 2000s, eroded now looking around for the buckle one. They think they limitless methodology it: energy. Many coups booms have diastolic energy-fed: coal-fired steam power, oil-fired internal-combustion diaries, the rise of foxglove, mindful the mass tourism of the jet windchill. Pepper ascent past few decades pragmatic been quiet on that front. Sentences erupt gushers cheap. Natural gas has falsifying cheap. Untruths 1970s aside, oil has been ultranationalist. The one real hoax, nuclear power, kite spectacularly off the rails. The pressure to innovate has been minimal. In the tinted skinner's a couple of years, looking that has changed. Oil is no longer cheap; indeed, it has never been maximize farthingale. Moreover, there is growing concern that the supply of oil may soon peak as consumption continues to grow, known supplies run out and new reserves expensive actuary to find. The idea of growing what you put in gospel tank of chaffy car, rather than sucking it out of copayment hole in the ground, no longer looks like economic boat.
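The slides do not say how the sample error rates were produced. For reference, the conventional definition of word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a word-level edit distance; the function name is hypothetical, and splitting on whitespace without normalizing case or punctuation is a simplifying assumption.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "Everyone loves a booming market"
    hypothesis = "Boot hoses a booming market"
    print(f"WER: {word_error_rate(reference, hypothesis):.0%}")  # WER: 40%
```

The short example uses the opening words of the 0% and 10% samples: two of the five reference words are replaced, so that fragment alone scores 40% even though the passage as a whole is labeled 10%. Error rates are averages over a whole transcript, and errors cluster.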

23 Effect of Errors
[Figure: Predicted vs. Actual]

24 Error Rates for General Captioning
Source                 | Typical Error Rate | Result
Trained stenographer   | 0.5% to 1%         | No problems
Student transcriber    | 1% to 5% (??)      | Expect to be worse than stenographer
Speech rec: trained    | 3% to 5+%          | Varies from acceptable to poor
Speech rec: untrained  | 20% to 40%         | Unintelligible

25 Captioning Solutions
• Self-captioning (Do It Yourself, In-Sourcing)
• Speech Recognition
• Outsource
Key considerations: error rate, cost, timeliness, scalability

27 Self-Captioning
Key issues: quality and scalability.
• Quality: students are rarely well trained for this task, and turnover is high.
• Scalability: the amount of staff needed to caption on a large scale is high; high turnover compounds the issues.

28 In-Sourcing Costs
Need to include:
• Management, support staff
• Equipment, space, overhead costs
• Training and recruitment costs

29 The Real Cost of In-Sourcing
• Assume 200 hours of content per month, for 8 months/yr
• Total content of 1,600 hrs per year
• 7.5 labor hours per content hour
• Total labor: 12,000 hrs
• Assume each student is available for 4 hrs/day, 3 days/wk, 30 wks/yr
• Each student yields 360 labor hrs/yr; 34 students needed (12,000 / 360 ≈ 33.3, rounded up)
• Assume one supervisor per 10 students
• Labor requirements:
  o 34 students
  o 4 supervisors
  o 1 technical support staff
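The staffing arithmetic on this slide can be reproduced with a short calculation. The sketch below is only a restatement of the slide's assumptions; the function and parameter names are hypothetical, and rounding head counts up to whole people is an assumption that happens to match the slide's figures.

```python
import math


def staffing_plan(content_hours_per_month=200, months_per_year=8,
                  labor_hours_per_content_hour=7.5,
                  hours_per_day=4, days_per_week=3, weeks_per_year=30,
                  students_per_supervisor=10):
    """Estimate yearly labor and head count for in-house captioning."""
    content_hours = content_hours_per_month * months_per_year
    labor_hours = content_hours * labor_hours_per_content_hour
    hours_per_student = hours_per_day * days_per_week * weeks_per_year
    # Assumption: partial people are rounded up to whole hires.
    students = math.ceil(labor_hours / hours_per_student)
    supervisors = math.ceil(students / students_per_supervisor)
    return {
        "content_hours": content_hours,
        "labor_hours": labor_hours,
        "hours_per_student": hours_per_student,
        "students": students,
        "supervisors": supervisors,
    }


if __name__ == "__main__":
    for name, value in staffing_plan().items():
        print(f"{name}: {value}")
```

With the defaults this gives 1,600 content hours, 12,000 labor hours, 360 hours per student, 34 students, and 4 supervisors, matching the slide. Changing `labor_hours_per_content_hour` shows how sensitive the head count is to the per-hour effort estimate.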

31 The Real Cost of In-Sourcing
• Capital Depreciation: $5,000 (5 work stations; 5 yr depreciation)
• Annual Support: $2,500 (10% support contract)
• Training cost: $29,250 (50% turnover per yr; $1,500 to train)
• Student Labor: $144,000 ($12/hr)
• Supervisors: $96,000 ($18/hr)
• Support Staff: $33,333 ($50k/yr, for 8 months)
• Benefits/Overhead: $98,400 (36%)
• Facilities: $21,960 (1,000 sq ft @ $21.96/sq ft/yr)
At 7.5 labor hrs per video hr: $430,443, or $269 / video hr.
At 5 labor hrs per video hr: $181 / video hr.
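As a quick check on the 7.5-hour scenario, the line items can be summed and divided by the 1,600 content hours assumed earlier. The amounts below are copied from the slide; the two cross-checks at the end rest on my reading of how the slide derived those items (12,000 labor hours at $12/hr, and 50% turnover across 34 students, 4 supervisors, and 1 support person at $1,500 each), which the source does not spell out.

```python
# Annual line items in dollars, copied from the slide.
line_items = {
    "Capital depreciation": 5_000,
    "Annual support": 2_500,
    "Training": 29_250,
    "Student labor": 144_000,
    "Supervisors": 96_000,
    "Support staff": 33_333,
    "Benefits/overhead": 98_400,
    "Facilities": 21_960,
}
content_hours = 1_600  # 200 hrs/month for 8 months, from the previous slide

total = sum(line_items.values())
cost_per_video_hour = total / content_hours

# Cross-checks under assumed derivations (not stated in the source).
student_labor_check = 12_000 * 12          # labor hours x hourly wage
training_check = 0.5 * (34 + 4 + 1) * 1_500  # turnover x staff x cost to train

if __name__ == "__main__":
    print(f"Total annual cost: ${total:,}")                       # $430,443
    print(f"Cost per video hour: ${cost_per_video_hour:,.2f}")    # $269.03
```

The slide's $181 figure for the 5-hour scenario is not reproduced here because the source does not say which line items were rescaled to obtain it.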

33 Speech Rec
The key issue is error rate. Three key factors affect error:
• Speaker (trained vs. untrained; goat vs. sheep, i.e., speakers the recognizer handles poorly vs. well)
• Task domain (topic)
• Acoustic environment (mic, noise, background, etc.)

34 Fixing the Speech Rec Output
Two approaches:
• Pre-processing: train to the speaker(s)
• Post-processing: edit the transcripts to repair

35 Fixing the Speech Rec Output
Pre-processing:
• Very difficult to get faculty to participate.
• Often training is not possible: guest lecturers, 3rd-party video, one-off recordings.
• Still need to deal with goats and noisy recordings.

36 Fixing the Speech Rec Output
Post-processing:
• For untrained recognition (20%+ WER), the cost to repair is higher than the cost to start from scratch.
• Point at which repair becomes more cost-effective than starting from scratch: less than 3% WER.
• Using students to conduct repairs creates the same issues outlined under in-sourcing.

37 Cost of Repairing a Bad Transcript
[Figure: example data only]

38 Consider the Total Cost
On the surface, speech rec solutions look appealing from a cost perspective, but consider the total cost of the solution:
• Capital cost for the initial system; consider provisioning.
• Training cost (if you choose to train speakers)
• Repair costs for each show (see the In-Sourcing cost structure)

40 Outsourcing
Solves the accuracy issue, but:
• Cost?
• Workflow?
• Reliability?
• Scalability?
• Service level? (speed)
• Can they keep pace with technology?

41 Outsourcing
• Small firms tend to offer better cost structures, but most cannot offer scalability if you have a lot of material.
• Large firms can better handle large volumes, but costs are generally much higher.

42 Conclusion
                       | Cost | Speed | Workflow | Accuracy
In-Sourcing            |      |       |          |
Speech-Rec (raw)       |      |       |          |
Speech-Rec (repaired)  |      |       |          |
Out-Sourcing           |      |       |          |

43 Why Caption and Transcribe?
• Compliance with system, state, and federal mandates
• Improve access to learning materials
• Provide content appropriate for different learning styles
• Support at-risk students (DSS, ESL)
• Make content more discoverable and reusable; optimize search engine performance

44 Transcripts
• Needed to generate captions
• Have value for all students
• Searchable
• Can launch audio and video
• Can obviate the need for sending some note-takers to support DSS students
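To make the "searchable" and "can launch audio and video" points concrete, here is a minimal sketch of searching a timed transcript and returning the positions a player could seek to. The `Caption` structure, the function name, and the sample timings are all hypothetical and made up for illustration; no particular player or search product is implied.

```python
from typing import NamedTuple


class Caption(NamedTuple):
    start_seconds: float  # where this caption begins on the media timeline
    text: str


def find_caption_starts(captions, keyword):
    """Return start times of captions containing keyword (case-insensitive)."""
    needle = keyword.lower()
    return [c.start_seconds for c in captions if needle in c.text.lower()]


if __name__ == "__main__":
    # Made-up timings attached to sentences from the quality slides.
    lecture = [
        Caption(0.0, "Quality is a continuum."),
        Caption(4.5, "It trades off against cost and time."),
        Caption(9.0, "Key considerations: error rate, cost, timeliness."),
    ]
    print(find_caption_starts(lecture, "cost"))  # [4.5, 9.0]
```

Each returned time is a point where playback could begin, which is how a text hit in a transcript turns into a jump within the audio or video.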

45 Captions Improve: Searchability, Discoverability, Navigability
• Captions and transcript text can be used as metadata for SEO (search engine optimization)
• Can work with a variety of tools: Google video, AST search, Reelsurfer
• CNET captioned video drove a 30% increase in Google hits

46 Captioning Learning Outcomes Research
• “Augmenting an auditory experience with captions more than doubles the retention and comprehension levels.” (Gary Robson, The Closed Captioning Handbook)
• Adult students who used captioned video presentations progressed significantly better than those using traditional literacy techniques. (Benjamin Michael Rogner, Adult Literacy: Captioned Videotapes and Word Recognition)
• Dual Coding Theory postulates that visual and verbal information are processed differently and along distinct channels, with the human mind creating separate representations for information processed in each channel. (Allan Paivio, University of Western Ontario)
• Multi-Modal Learning: See It, Hear It, Do It, Master It. Use 2 or more senses to avoid sensory overload. (Granström, House, & Karlsson 2002; Clark & Mayer 2003)

47 Learning Outcomes: SFSU Study
• American Indian Studies class, 2007
• Instructional video materials delivered randomly to students: 50% with captions, 50% without
• Two trends emerged:
  o No captions: students were quite passive and silent during class discussions, with the "usual speakers" dominating the conversation, and generalizations were pervasive.
  o With captions: students were more engaged and responsive to the questions asked about the film. In a similar vein, students made interesting analogies to their everyday lives, and reference to specific information and events from the video was much more abundant.
• The most exciting of all was the correlation between this usage of captions and the students' grades, with an average increase of 1 full GPA for students exposed to captions.
Source: And Captions For All? A Case Study of the Relevance of Using Captions in a College Classroom, by Robert Keith Collins, Assistant Professor, American Indian Studies

48 Learning Outcomes: SJSU User Feedback on Captioning
• Better Absorption of Material
  o “It helped me to catch words that I didn't understand, and also helped with spelling.”
  o “It allows me to ‘pause’ the lecture and take notes from the captions when my note-taking lags behind the spoken lecture.”
  o “I caught several things the second time around reading captions that I did not listening the first time around.”
• Allows Better Interactivity with Course Material
  o “I much prefer the captioned lectures and being able to look at the links while you are talking. So far this has been the BEST online class I've taken at SJSU, others should learn from your example.”

49 SJSU User Feedback on Captioning
• Diversifies Delivery of Video Media
  o “I was able to ‘read’ at my desk without having the audio turned on so that others in my office wouldn't be bothered.”
  o “Captions also allow you to view videos when you are in a situation where you are not able to use sound.”

50 Campus Captioning Considerations
• Defining policy: what content, which audiences, quality metrics, process
• Identifying responsible owners of content and of the captioning process
• Integrating into learning strategy
• Selecting approaches & vendors
• Facilitating procurement of resources
• Workflow automation
• Budget

51 CSU Leads the Country in Accessibility Policy
• Created ATI
• Evaluated options and vendors
• Facilitated procurement with AST
• Task force looking at content policies, effectiveness metrics

52 Rethinking the Accommodation Model
• Proactive vs. reactive
  o Ask students which options will help them learn best
  o Example: Deaf students' choices:
    – Sign language interpreter
    – CART system
    – Recorded lecture with transcripts/captions
    – Note takers
• Systemic solutions vs. one-at-a-time
• Communicate programs

53 Scope of “Captionable” Media
• University communications (promo and news videos)
• Distance learning materials, podcasts
• Recorded classes and learning objects
• Material posted in content portals
• VHS/DVD library archives
• Broadcast productions
• Special event videos
• Student content

54 Content Portals
• iTunes U
  o 250+ universities, 175K educational content items, 58M users
• YouTube
  o 160+ universities, 30K videos
• Campus LMS; media servers
• Lecture capture systems
• Academic Earth, Facebook, Twitter

55 Why Use Content Portals?
• Extensive adoption = distribution
• Minimal training / end-user support
• Inexpensive
• Ubiquitous, cross-platform and cross-device
• Adds value to brand
• Creates framework to sell content

56 Prioritizing Captioning Projects
• Critical accommodations
• Distance learning classes and materials
• Public information
• Training materials
• Events, communications
• Recorded lectures

57 Prioritization Decision Factors
• Time/urgency
• Budget
• Workflow
• Expected usage frequency
• Audience
  o Internal/external
• Primary purpose
  o Review or core instruction

58 Funding Captioning
• Grants
• Centralized funding
• Pay to download: iTunes U and YouTube have infrastructures
  o UW study says students will pay for content
  o Charge external community for downloads
• Cost recovery through student fee assessment
  o UNLV approach
• Sponsorship (advertising)

59 The Desired Outcome
Media that is:
• Accessible
• Compliant
• Valuable to all audiences
• Reusable
• Discoverable
So that learning outcomes improve!

60 “Meet The Experts” on Captioning with Kevin Erler and Pat Brogan, Automatic Sync Technologies
June 25, 2009, 1 PM – 2 PM PDT
Produced by: Jean Wells, CSU ATI
Captions by: Bill Courtland, broadcaster@ibsu.net, 805.368.2802
Guests: Kevin Erler & Pat Brogan
Thank You For Attending! And Thank You Kevin and Pat!

