ViSiCAST 2002 Technical Audit 4 October 2002, Brussels Michele Wakefield - Project Manager, ITC.


1 ViSiCAST 2002 Technical Audit 4 October 2002, Brussels Michele Wakefield - Project Manager, ITC

2 The ViSiCAST Project Virtual Signing Capture Animation Storage and Transmission

3 Aims of ViSiCAST Project “…support improved access by deaf citizens to information and services in sign language”  by successfully developing signing systems for  broadcast, WWW & ‘over the counter’ type applications  user friendly methods to capture & generate signs  machine readable system to describe gestures ... preferred medium is sign language

4 ViSiCAST Consortium: Independent Television Commission; Televirtual; University of East Anglia; The Post Office; Royal National Institute for Deaf People; Instituut voor Doven; Hamburg University; Institut für Rundfunktechnik; Institut National des Télécommunications

5 Project Dimensions  Duration  Start: January 2000  Finish: December 2002  36 months  Total Costs  3770kECU total  2876kECU funding from EC

6 ViSiCAST Project Highlights  Signing transmissions demonstrated at IBC 2002  MPEG-4 compliant INT-IRT demonstrator to deliver an open signing service for broadcast DTV  BBC demonstrator to deliver closed DTV signing service  Translate simple sentences in real time to sign animation  WWW Weather-forecaster launched in the Netherlands  Interactive sign language learning tool  2nd trial of TESSA system now nationwide and RNID re-promoting ViSiCAST  after success of pilot at Science Museum, London  encouraging national media coverage

7 ViSiCAST Project Structure [diagram]  Streams: Technology; User Application; Exploitation & Dissemination  Labels: Animation, Linguistics, Broadcast, Internet, Community, Evaluation, Exploitation

8 Presentations by Core Streams  Technology: Animation & Linguistics  WP4 Animation  WP5 Linguistics  User: Applications  WP2 Sign Tutor  WP1 Broadcast  WP2 WWW  WP3 Face to Face  WP6 Usability  Exploitation & Dissemination

9 Presentations by Core Streams  Technology: Animation & Linguistics  WP4 Animation: Mark, TV (5); Ralph, UEA (5), including demonstrations of synthetic animation and new avatar  WP5 Linguistics: Thomas, UH (5)  User: Applications  WP2 Sign Tutor: Thomas (5)  WP1 Broadcast: Francoise, INT (10)  WP2 WWW: Corrie, IvD (10)  WP3 Face to Face: Mike, UEA (10)  WP6 Usability: Mel, RNID (5)  Exploitation & Dissemination  WP7,8: Nick, BBC (10)

10 Presentation by Streams - Animation  WP 4 Animation  Increased realism in sign generation  Enhanced signing experience  WP5 Sign Language Linguistics  Use of natural sign language  Synthesis of sign language gestures

11 Animation Work: Objectives  WP4:  Develop Hi-Resolution Avatars + related capture, and animation  To enable and support application development in WPs using WP4 (& WP5) Product  To further develop, compare and integrate both proprietary and standard solutions, where appropriate, in networked environments

12 Technology: WP4, Animation  At start of Year:  Visia 2  Running in Mask 1  Using Motion Capture Data only  Reasonable animation, expression etc.

13 Technology: WP4, Animation  Visia 2 in MPEG-4  Mesh partitioned into anatomical segments  MPEG-4 compliant authoring tool  Animation editing tool  Server-client tool for TX of animation parameters  MPEG-4 SNHC player <25fps  Embedded within an MPEG-4 set-top box

14 Technology: WP4, Animation  Visia 3  Updated Virtual Human  Higher resolution & polygon count, more realistic photographic textures  Improved articulation  Mesh distortion applied to garments  Facial expression via skeleton manipulation & morphs  Speech Enabled

15 Technology: WP4, Animation  Visia 3  New host software - Mask TNG  Writing new Active X Controls  Superior functionality, lighting and Camera FX, image quality, frame rate, flexibility etc.  >75 FPS

16 Technology: WP4, Animation  Visia 3  Running in Mask TNG Graph

17 Technology: WP4, Animation  Facial Morphs  Created in Maya, exported to Mask TNG  Based on Sign Language expressions (BSL Dictionary)  Inter-operable  Variable weighting (<100%+)  May be used with Mo-Cap data or for synthetic sign

18 Technology: WP4, Animation  Facial Animation - Experimental Work  Tracking of Active Shape Models  Tracking of Active Appearance Models

19 Technology: WP4, Animation  Facial Animation - Experimental Work  Vision-based motion capture of facial expressions using MPEG-4 compliant templates.

20 WP4: Synthetic Animation - Introduction  Task:  Make avatar do signing synthetically  as specified by ViSiCAST’s Signing Gesture Markup Language - SiGML  Motive:  Synthetic animation is more flexible than animation via motion-capture - “just write some more SiGML”  Support Natural-Language-to-Animation strategy of WP4-5  In broadcasting applications: put synthetic player on receiver and transmit SiGML - very low bandwidth

21 WP4: Synthetic Animation - Context  Televirtual Avatar is a deformable textured Mesh  Mesh shape and position are determined by configuration of underlying Skeleton  skeleton configuration: a.k.a. “Bone-Set”  To animate avatar: need to generate stream of Bone-Sets - one per frame of animation  i.e. BAF data stream- BAF = “Bones Animation Format”  Data intensive: 4Kb per bone-set
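
The "data intensive" point on this slide can be made concrete with a little arithmetic. The sketch below assumes that "4Kb" means 4 kilobytes (4096 bytes) per bone-set and that the avatar runs at 25 frames per second; neither assumption is confirmed by the slide, and the function name is purely illustrative.

```python
def baf_rate_kbit_per_s(bytes_per_bone_set: int, frames_per_second: float) -> float:
    """Uncompressed stream rate in kbit/s (1 kbit = 1000 bits)."""
    return bytes_per_bone_set * 8 * frames_per_second / 1000.0


# Assumption: "4Kb" on the slide is 4 kilobytes = 4096 bytes per bone-set.
# Assumption: 25 frames per second, one bone-set per frame.
rate_25fps = baf_rate_kbit_per_s(4096, 25)
print(f"Raw BAF stream at 25 fps: {rate_25fps:.1f} kbit/s")
```

Under these assumptions the raw stream is several hundred kbit/s, which is why the deck stresses compression and the much lower bandwidth of transmitting SiGML instead.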

22 WP4: Synthetic Animation - Technical Approach  SiGML specifies gestures through:  Postures:  hand shape  hand orientation - palm and extended finger direction  position of hand(s) in signing space  Motions - straight-line, circular, zig-zag etc.  Synthetic Animation Engine:  specifies hand bone configuration for given posture  configures arm/shoulder bones using Inverse Kinematics  implements transition from one posture to next using non-linear interpolation - often via control system modelling
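
The two engine steps named on this slide, inverse kinematics for the arm and non-linear interpolation between postures, can be illustrated with a deliberately simplified sketch. It assumes a planar two-segment arm (upper arm plus forearm) and a smoothstep easing curve; the real engine works on a full 3D skeleton and, per the slide, often uses control system modelling, so every name and simplification here is hypothetical.

```python
import math


def two_link_ik(x, y, upper, fore):
    """Shoulder and elbow angles (radians) placing the wrist at (x, y).

    Planar two-segment arm; unreachable targets are clamped to full extension.
    """
    d2 = x * x + y * y
    cos_elbow = (d2 - upper ** 2 - fore ** 2) / (2 * upper * fore)
    cos_elbow = max(-1.0, min(1.0, cos_elbow))
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(
        fore * math.sin(elbow), upper + fore * math.cos(elbow)
    )
    return shoulder, elbow


def forward(shoulder, elbow, upper, fore):
    """Wrist position for the given joint angles (used to check the IK)."""
    ex, ey = upper * math.cos(shoulder), upper * math.sin(shoulder)
    return (ex + fore * math.cos(shoulder + elbow),
            ey + fore * math.sin(shoulder + elbow))


def smoothstep(t):
    """Ease-in/ease-out curve on [0, 1]: slow start, slow finish."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def interpolate_posture(start, end, t):
    """Non-linear blend between two joint-angle tuples at time t in [0, 1]."""
    s = smoothstep(t)
    return tuple(a + (b - a) * s for a, b in zip(start, end))


angles = two_link_ik(0.3, 0.4, 0.3, 0.3)
print(forward(*angles, 0.3, 0.3))
```

Interpolating in joint-angle space with an easing curve avoids the abrupt starts and stops that plain linear interpolation produces.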

23 WP4: Synthetic Animation - Progress (i)  Initial Prototype (D4-2) delivered  Supported most of manual SiGML  Implemented in Perl (interpreted scripting language)  BAF/VRML output to file - and then to avatar  Relatively slow - often < 15 fps  Perl module packaged as ActiveX control  relatively unwieldy architecture  Enhancements for (M5-11)  BAF data stream cached in memory - fed directly to avatar  Front-end (for WP5): HamNoSys input server, with built-in HamNoSys-to-SiGML translation

24 WP4: Synthetic Animation - Progress (ii)  HamNoSys-to-Signing (Fast)  Synthetic Animation Engine re-implemented in C++  50 times faster - generates approx fps, supporting real-time streamed input (e.g. Broadcast, WWW)  More flexible framework - basis for improved authenticity  Modular system architecture - supports flexible application development, scripting in WWW pages, etc.  Upgrade to Mask  Interface to new primitive Mask2 ActiveX control  allows better control of animation frame scheduling  BAF replaced by VBM (ViSiCAST Bones and Morphs) - provides framework for support of non-manual SiGML

25 Presentation by Streams - Linguistics  WP 4 Animation  Increased realism in sign generation  Enhanced signing experience  WP5 Sign Language Linguistics  Use of natural sign language  Synthesis of sign language gestures

26 WP 5: Language Technology  Goal within the project:  To provide semi-automatic translation from English into BSL, DGS, NGT  Can also be used to assist the user in monolingual language input  No writing system for sign languages established

27 Presentation by Streams  Animation and Linguistics  User Applications  Exploitation and Dissemination

28 Presentation by Streams - Sign Tutor  WP2 Sign Tutor  WP1 Television  Closed signing for Broadcast DTT  WP2 Internet  Information and Education for Sign Language Learners  WP3 Face to Face  High Street Post Office Counter Services  WP6 Comparison of virtual signing  with video-recorded Human Signing

29 Presentation by Streams - Television  WP2 Sign Tutor  WP1 Television  Closed signing for Broadcast DTT  Enhanced signing experience  Regulation and Standards  WP2 Internet  Information and Education for Sign Language Learners  WP3 Face to Face  WP6 Comparison of virtual signing

30 Virtual Humans on TV: The Advantages  Low transmission rate < 25 kbit/s  Compatibility with signing on other media and sign languages  Precise, sharp representation of signer  Open display options  Compliance with international standards: MPEG, DVB  Future-proof:  cost saving  allows vast no. of signed programmes  unified framework from video-based to VH signing

31 Broadcast VH Signing: Achievements  Integrated TX system for broadcast to STBs  Implementing virtual human s/w in STB  MPEG-2 delivery layer for maximum compliance:  with existing hardware  with MPEG & DVB standards  with proprietary formats  MPEG-4 Audio-Video codec and player  MPEG-4 compliant virtual human  MPEG-4 SNHC virtual human codec and player  MPEG-4 based closed signing service demonstrated at IBC 2002

32 Broadcast VH Signing: Functional architecture [diagram]  Block labels: MPEG-4 SNHC encoder, BAF encoder, MPEG-4 video encoder, MPEG-2 AV encoder, MUX, Packet, MPEG-2 TS Delivery, dePacket, deMUX, MPEG-4 SNHC decoder, BAF decoder, MPEG-4 video decoder, MPEG-2 AV decoder, Compositor, MPEG-4 multimedia player, Proprietary multimedia player  Layer labels: Encoder, Decoder, System, Compositor  Legend: normative, proprietary

33 Broadcast VH Signing: System layer implementation [diagram]  Block labels: UDP/TCP packetiser, IRT-DSP MPEG encoder, RF modulator, DVB receiver card, IP filter, MPEG-2 TS  Layer labels: System, Delivery, Encoder, Decoder, Compositor

34 Broadcast VH Signing: Perspectives  Advanced TX system for broadcast to MHP compliant STBs  Open, MPEG & DVB compliant architecture  Improved synchronisation layer  Integrating a compositing layer  Implementing an enriched MPEG-4 multimedia authoring tool  Integrating SiGML stream

35 Demonstration

36 Broadcast VH Signing: Achievements  Integrated TX system for broadcast to STBs  demonstrator completed end of 2002  Implementing virtual human s/w in STB  Incorporating a compression layer  Using MPEG-2 delivery layer for maximum compliance:  with existing hardware  with MPEG & DVB standards  with proprietary formats

37 Broadcast VH Signing: Perspectives  Advanced TX system for broadcast to MHP compliant STBs  Open, MPEG & DVB compliant architecture  Improved synchronisation layer  Integrating a compositing layer  Implementing an enriched MPEG-4 multimedia authoring tool  Integrating SiGML stream

38 Presentation by Streams - WWW - Web pages with signing Field trials  WP2 Sign Tutor  WP1 Television  Closed signing for Broadcast DTT  WP2 Internet  Information and Education for sign language learners  Web-pages with signing  WP3 Face to Face  High Street Post Office Counter Services  WP6 Comparison of virtual signing

39 Weather Forecast Application [diagram]  Labels: content provider, forecast creation tool, weather signs, avatar, ‘play list’, Internet, web-browser + plug-in, user  Stages: 1st DEMO, 2nd DEMO

40 Demo

41 The field trials with Deaf users  Hosting at site of Dutch Deaf organisation Dovenschap:  Running from end-June until end-October  Deaf users can join the field trial by filling in a form on the website  CD-ROM with necessary software sent to users

42 Field Trial Promoted  70 [e-mails] to webmasters of Deaf clubs, Deaf schools, Deaf organisations and private sites of Deaf persons  promotion on Teletext (TV)  on informative websites for Deaf people  visit at meeting of national Deaf organisation with 12 member organisations  article in magazine for sign language interpreters  30 CD-ROMs sent to Deaf clubs and schools

43 Trial Feedback  Helpdesk, contacted by [e-mail]  Discussion page on website  Evaluation form: software and installation, included with the software when received  Evaluation form: avatar and sign language, will be sent end of October 2002

44 Present Situation  Field trial still running  News slowly spreading  Positive reactions  Results at the end of November

45 Presentation by Streams – Face to Face  WP2 Sign Tutor  WP1 Television  Closed signing for Broadcast DTT  WP2 Internet  WP3 Face to Face  High Street Post Office Counter Services  Close involvement with RNID  WP6 Comparison of virtual signing  with video-recorded Human Signing

46 WP3 Overview  Evaluation – October 2001  New TESSA system – Mar 2002  Post Office Trial – May 2002 – Present  Sign Recognition – April 2002 – Present

47 Evaluation – October 2001  Evaluation conducted at PO concept store using TESSA V3.  10 Deaf People and 5 Counter Clerks participated over 10 days.  Mirror of previous evaluation + Some comparative tests of virtual signing with a video recorded human signer (full details in WP6 presentation)

48 Evaluation – Observations  Clerks complained about the speed of transactions  Caused by :  Toggle switch for recogniser  Mis-recognitions caused by large vocabulary  Poor mapping from recognised speech to phrases  Cumbersome graphical interface

49 Tessa V4 – Recognition System  ‘Bag of words’ language model  Only words relevant to post office phrases recognised  Many fewer insertion errors  More resilient to external noise  [diagram: word set Hello, Where, Goodbye, Going, First, Second, Class, …]

50 TESSA V4 – Phrase Mapping  Phrase mapping system derived from work on Automatic Call Routing  Represent each of the signed phrases and the test phrase as vectors in a co-occurrence matrix  [matrix diagram: words (A, About, Access, Account, …, You, you’ve, Your) against phrases (Phrase 1, Phrase 2, Phrase 3, …, Phrase N)]

51 TESSA V4 – Phrase Mapping  Weight the entry W(i,j) such that: [equation not preserved in this transcript]  Calculate distance between vectors representing each canonical phrase and input phrase.  More details in S. Cox, “Speech and Language Processing for a Constrained Speech Translation System”, in Proc. Int. Conf. on Spoken Language Processing, October 2002; M. Lincoln and S. Cox, “A Comparison of Language Processing Techniques for a Constrained Speech Translation System” (submitted to ICASSP 2003).
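
The phrase-mapping idea on slides 50 and 51 (represent phrases as word vectors, then pick the canonical phrase nearest to the input) can be sketched as follows. The weighting formula W(i,j) is not preserved in this transcript, so the sketch substitutes a smoothed inverse-document-frequency weight and cosine distance; both are assumptions, as are the example phrases and all function names.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def build_idf(phrases):
    """Assumed weighting: smoothed IDF, so words shared by many phrases count less."""
    n = len(phrases)
    doc_freq = Counter()
    for phrase in phrases:
        doc_freq.update(set(tokenize(phrase)))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}


def vectorize(text, idf):
    """Sparse weighted word vector; words outside the phrase vocabulary are dropped."""
    counts = Counter(tokenize(text))
    return {w: c * idf[w] for w, c in counts.items() if w in idf}


def cosine_distance(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # no usable words: treat as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def map_phrase(input_text, canonical):
    """Return the canonical phrase whose vector is closest to the input."""
    idf = build_idf(canonical)
    query = vectorize(input_text, idf)
    return min(canonical, key=lambda p: cosine_distance(query, vectorize(p, idf)))


# Hypothetical canonical phrases, not taken from the TESSA phrase set.
CANONICAL = [
    "do you want first or second class",
    "where is it going",
    "please put it on the scales",
]
print(map_phrase("would you like first or second class", CANONICAL))
```

Because the clerk's wording only has to land nearest to the right canonical phrase, paraphrases and some mis-recognised words can still map correctly.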

52 TESSA V4 - Mapping Evaluation  Subset of 155 phrases.  5 Talkers, each asked to  write down another way of expressing the phrase  record speaker saying this phrase  Recognise speech (NB No Adaptation)  75.1% Correct ; 49.8% Accurate  Test phrase mapping on both text and recognised speech
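
The slide reports "Correct" and "Accurate" separately without defining them. A common convention in speech recognition scoring is that "correct" ignores inserted words while "accuracy" penalises them; the sketch below assumes that convention, which the slide does not confirm, and the word counts are invented purely to illustrate how the two figures can diverge.

```python
def percent_correct(n_ref, deletions, substitutions):
    """Share of reference words recognised correctly; insertions are ignored."""
    return 100.0 * (n_ref - deletions - substitutions) / n_ref


def percent_accuracy(n_ref, deletions, substitutions, insertions):
    """As percent_correct, but each inserted word also counts as an error."""
    return 100.0 * (n_ref - deletions - substitutions - insertions) / n_ref


# Illustrative counts only (not from the ViSiCAST evaluation).
N, D, S, I = 1000, 49, 200, 253
print(percent_correct(N, D, S))
print(percent_accuracy(N, D, S, I))
```

Under this convention the gap between the two percentages equals the insertion rate, so a large gap points to many spurious words in the recogniser output.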

53 TESSA V4 – Mapping Evaluation

54 TESSA V4 – User Interface  Push to talk (automatic end of speech detection)  Larger Buttons  Common Phrases which don’t need to be spoken  Continually updated list of top 5 most used signs

55 Post Office Trial - Set-up  Tessa V4 used  5 Post Offices  London, Bristol, Derby, Liverpool, Wolverhampton  Known Deaf Communities In Each Area  3 Months Duration  Equipment Given Health & Safety Approval  Trained 19 Counter Clerks  Provided Help Desk Support

56 Post Office Trial - Survey  Independent Survey of Customers by RNID  Independent Survey of Counter Clerks  All Users Given RNID Questionnaire  All Counter Clerks Interviewed

57 Post Office Trial - Publicity  BBC See Hear – early Oct  Channel 4 – Documentary on BSL  Disability Times – 1 October 2002  BBC Worldwide – 24 August 2002  ITV London Tonight – 21 August 2002  Liverpool Echo – 1 August 2002  Camden Chronicle – 1 August 2002  Wolverhampton Chronicle – 25 July 2002

58 Post Office Trial - Publicity  Bristol Evening Post – 22 July  Liverpool Echo – 19 July  Derby Evening Telegraph – 18 July  Wolverhampton Express and Star – 17 July

59 Sign Language recognition  Preliminary investigation  6 Gestures, 10 training and 5 testing examples  Single user  Motion captured data  HMM recognition system  Initial results – 95% accuracy
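
To illustrate what an HMM recognition system does at classification time, the sketch below scores an observation sequence against one small model per gesture using the forward algorithm and picks the most likely model. It assumes discrete observation symbols and hand-set parameters; the project's recogniser was trained on motion-captured data and its actual features, topology and training procedure are not described on the slide, so all models and names here are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Forward algorithm: P(obs | model) for a discrete-observation HMM.

    obs must be a non-empty list of symbol indices.
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][symbol]
            for s in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Name of the model giving the observation sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Two-state left-to-right topology: stay in state 0 or move on to state 1.
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]

# Hypothetical gesture models: (start probabilities, transitions, emissions).
MODELS = {
    "gesture_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),
    "gesture_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),
}

print(classify([0, 0, 1, 1], MODELS))
```

With one model per sign, adding a new sign to the vocabulary means training one more model rather than retraining the whole recogniser.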

60 Sign Language Recognition  Comparison of recognition using motion captured data and video.  Collaboration with EU ‘WISDOM’ project.  Currently Recording and editing multiuser database.  10 signs, 10 training and 5 testing examples  5 users  Motion captured and video  RNID to make independent evaluation of recognition accuracy.

61 Presentation by Streams – Usability of Virtual Signing  WP2 Sign Tutor  WP1 Television  Closed signing for Broadcast DTT  WP2 Internet  Information and Education for Deaf People  WP3 Face to Face  High Street Post Office Counter Services  WP6 Comparison of virtual signing  with video-recorded Human Signing

62 Methods  60 phrases from the PO TESSA system signed by human interpreter on video  120 phrases signed by the virtual human  10 profoundly deaf people whose first language is BSL  Outcome measures:  Accuracy of identification  Subjective ratings for each phrase  Overall subjective ratings

63 Accuracy of identification

64 Subjective Ratings [chart; scale labels: Low, High, Very easy, Very difficult]

65 Visual Analogue Scales

66 Usability Conclusions  Higher accuracy of identification for human than virtual signed phrases (  20%)  Some improvements in intelligibility of virtual signing required  Non-ceiling benchmark of accuracy determined  60% virtual signed phrases judged as good as human signed phrases  Greater scope for improvements in terms of subjective views of virtual signing  Impressive results for virtual signing

67 Exploitation and Dissemination Highlights  TESSA IT Awards & success in the community  WWW Weather Forecaster launched in 2 European Sign Languages & encouraging feedback  IvD & RNID host in UK and the Netherlands  Close Involvement of Deaf People  RNID promoting ViSiCAST nationally  BBC Collaboration for closed signing solution for broadcasting DTV for bandwidth efficiency  Increasing amount of in-vision signing disliked by hearing people  Impacts on DTT multiplexes where bit-rate is already at a premium

68 Exploitation & Dissemination  UK Government 10 year target - 5% of programmes on DTT services to be signed  Today, services use ‘open signing’  Hearing viewers can find it distracting  Seldom transmitted at peak viewing times  Closed signing offers freedom  for viewers - to turn on and off  scheduling freedom for broadcasters  but needs extra transmission feed  ViSiCAST uses ‘virtual human’  reducing bandwidth needs by factor of ten compared to video

69 Closed Signing – Why an avatar-based solution?  MPEG2 coding (0.5-1Mbit/s)  only 1 service signed per multiplex if at all  MPEG4 coding (<350Kbit/s)  no more than 2 services signed per multiplex  more efficient compression, and ability to code non-rectangular objects  Animated Avatars (<100Kbit/s)  may be possible to sign all services in a multiplex  need new techniques to capture motion of real signers
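
The comparison on this slide is essentially a capacity calculation: how many signing feeds fit into whatever bit-rate a multiplex can spare. The sketch below uses the per-feed rates quoted on the slide, taken at their upper bounds, and assumes a spare capacity of 700 kbit/s purely for illustration; that figure is not from the presentation.

```python
def services_supported(spare_kbit_s: float, per_service_kbit_s: float) -> int:
    """Number of whole signing feeds that fit in the spare multiplex capacity."""
    return int(spare_kbit_s // per_service_kbit_s)


# Assumption: 700 kbit/s of spare capacity in the multiplex (illustrative only).
SPARE_KBIT_S = 700

# Per-feed rates quoted on the slide, taken at their upper bounds.
RATES_KBIT_S = {
    "MPEG-2 video (low end)": 500,
    "MPEG-2 video (high end)": 1000,
    "MPEG-4 video": 350,
    "Animated avatar": 100,
}

for name, rate in RATES_KBIT_S.items():
    print(name, services_supported(SPARE_KBIT_S, rate))
```

With this assumed budget the results line up with the slide's wording: at most one MPEG-2 feed "if at all", no more than two MPEG-4 feeds, and enough avatar feeds to cover several services.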

70 Closed Signing Requirements for the Broadcaster  Be compatible with existing studio, distribution & monitoring infrastructures  maintain freedom to schedule as needed  accommodate live signing and reactive scheduling  allow for regional content insertion and time-shifting &  cope with the variety of picture display formats

71 Avatar Signing developments for broadcasting  Motion capture needs to be efficient and signer-independent  enabling signing of live and reactive broadcast material  best suited for offline broadcasting today  Facial motion capture needs refinements  Increasing realism makes avatars more acceptable

72 Signing Capture - Studio Implementation Original Programme Monitor Camera SDI Coding / Compression Signing Data SDI inserter Ethernet SDI with embedded Signing Data Tape Video Server Signer Motion capture

73 Studio and distribution issues  Provision of television programme material with associated signing  Development of equipment for conveying signing data within studio infrastructure  We have developed hardware to add signing or motion capture data to a SDI video stream.  The main program video/audio, and the corresponding data can then be routed via standard studio infrastructure.  The combined A/V and signing data can also be stored on server or video tape  Development of DVB inserter agnostic of signing signal coding method  Development of end-to-end DT demonstrator

