ViSiCAST 2001 Technical Audit 8 October 2001, Brussels Michele Wakefield - Project Manager, ITC.


1 ViSiCAST 2001 Technical Audit 8 October 2001, Brussels Michele Wakefield - Project Manager, ITC

2 The ViSiCAST Project Virtual Signing Capture Animation Storage and Transmission

3 Aims of ViSiCAST Project “…support improved access by deaf citizens to information and services in sign language”  user friendly methods to capture & generate signs  machine readable system to describe gestures ... preferred medium is sign language

4 Tessa at the Post Office  Using speech recognizer  Convert counter-clerk’s voice input to text  Generate sign stream from text  BSL – limited repertoire

5 ViSiCAST Consortium  Independent Television Commission  Televirtual  University of East Anglia  The Post Office  Royal Institute for Deaf People  Instituut voor Doven  Hamburg University  Institut für Rundfunktechnik  Institut National des Télécommunications

6 Project Dimensions  Duration  Start: January 2000  Finish: December 2002  36 months  Total Costs  3770kECU total  2876kECU funding from EC

7 ViSiCAST Project Highlights  Prototype enabling text translation and direct synthesis of sign language gestures  Quality assessment support to other EU project  New TESSA system trial at Science Museum, London  Achieved BCS IT Award and Gold Medal  Innovative transmission assessment for broadcast TV  BBC seek to deliver a closed signing service for broadcast DTV  WWW Weather-forecaster with Virtual Signer available in 3 Sign Languages

8 ViSiCAST Project Structure [Diagram labels: Technology (Animation, Linguistics); User Application (WWW, High Street, Broadcast); Evaluation; Exploitation & Dissemination]

9 Presentations by Core Streams  Technology: Animation & Linguistics  WP4 Animation: Mark, Televirtual (10)  WP5 Linguistics: Thomas, UH (10)  User: Applications  WP1 Broadcast: Werner, IRT; Francoise, INT (10)  WP2 WWW: Corrie/Margriet, IvD (10)  WP3 Face to Face: Stephen, UEA (10)  Exploitation & Dissemination  WP7, 8: Michele (10)

10 Technology Focus Objectives  WP 4 Animation  Increased realism in sign generation  Enhanced signing experience  WP5 Sign Language Linguistics  Use of natural sign language  Synthesis of sign language gestures

11 Animation: Initial Work  Developed TESSA & VISIA Avatars  Developed Capture / Animation system  Integrated into early demos of WPs 1-2-3

12 Animation Work: Objectives  WP4:  Develop Hi-Resolution Avatars + related capture, animation and transmission formats inc. compression  To enable and support application development in WPs 1-2-3 using WP4 (& WP5) Product.  To further develop, compare and integrate both proprietary and standard solutions, where appropriate


14 Animation: Current Work  Through Year Two  Continued to support Application development  Continuous upgrade to VISIA / TESSA player (OpenGL renderer under ActiveX control)  Bug fixing / Motion capture support  .baf format and compression layer with WP1 to create Broadcast Demonstrator using ViSiCAST system  MPEG compatibility / parallel development in WP4 and applications


16 Animation: Continuing & future Work  Working on ways to improve facial animation / realism (forehead / eyes)  Exploring Statistical Methods to define and generate facial Animation  Working on ways to facilitate Avatar creation (Photographic acquisition)  Mask 2 + Improved Mo Cap


20 MPEG-4 compliant Animation: Achievements  MPEG-4 SNHC for interoperable animation  MPEG-4 SNHC player and server delivered in June 2001  Making use of an MPEG-4 compliant Visia model  Compliance with VRML standard (H-Anim specifications)  Incorporating a full compression layer  3D mesh & texture encoding: 7 to 14 bit/vertex  Motion parameters (BAP/FAP) encoding: 5 to 25 kbit/s  Implementing importation and editing tools  Open delivery interface: MPEG-2, IP, ATM...

21 MPEG-4 compliant Animation: Perspectives  Advanced interoperable distributed animation system  Improved facial animation  MPEG-4 System layer implementation  Multimedia (audio, video, text…) synchronisation  Error resilience  Management of scene description  MPEG-compliant SiGML-driven animation  Open input/output interface

22 Presentation by Streams - Linguistics  WP 4 Animation  Increased realism in sign generation  Enhanced signing experience  WP5 Sign Language Linguistics  Use of natural sign language  Synthesis of sign language gestures

23 WP 5: Language Technology  Goal within the project:  To provide semi-automatic translation from English into BSL, DGS, NGT  Can also be used to assist the user in monolingual language input  No writing system for sign languages established

24 The last year: 3 deliverables  D5-1: Defining the interfaces  D5-2: Transfer to XML:  SiGML definition  D5-3 Prototype translation system:  English to notation

25 D5-1: Defining the interfaces  Adaptation of Discourse Representation Structure  Extension of HamNoSys, a phonetic transcription system for sign language  Notation conventions for all non-manual aspects relevant for (European) sign languages  Body movement  Head movement  Facial expressions  Mouthing and Mouth gestures  Eye movement  Synchronicity with manual elements

26 D5-2: SiGML  Defines XML domain based on D5-1 manual and non-manual notation  Simple timing model  Probably to be revised to ease integration with upcoming synchronisation models as required for broadcasting etc.  SMIL, XMT (MPEG4) etc.

27 D5-3: Proto text-to-sign notation  English to semantics (DRS)  CMU Parser  DRS construction  Semantics to sign language notation  DRS to HPSG semantics (ALE/MRS)  HPSG generation (ALE/LinGo)  HPSG PHON (HamNoSys) to SiGML

28 HPSG modelling of sign languages  Aiming at proper sign language, not anything like SEE  No detailed grammars published, no usable dictionaries  Most importantly: Data-driven  Lexicon and every aspect of our grammar fragment

29 Example: Verifying details

30 Demo: D5-3 plus D4-2  Due month 26 (Feb 02), i.e. work in progress  Complete route from English to sign language animation

31 Synthetic Animation of SiGML  Convert avatar-independent SiGML to avatar-specific description:  Define all SiGML locations (shoulder, eyes, fingertip, etc.) in terms of the avatar's geometry  Define hand shapes in terms of rotations of the hand joints  Determine arm joint rotations from hand positions by inverse kinematics  Convert SiGML movements into numerically defined trajectories  Output in BAF format or VRML
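The inverse kinematics step on this slide can be illustrated with a planar two-link arm. This is only a sketch of the general technique, not the project's solver: the segment lengths, the 2D simplification and the function names `two_link_ik` and `forward` are assumptions made for the example.

```python
import math

def two_link_ik(x, y, upper=0.30, fore=0.25):
    """Return (shoulder, elbow) angles in radians that place the wrist at (x, y).

    Hypothetical planar arm: `upper` and `fore` are assumed segment lengths in metres.
    """
    d2 = x * x + y * y
    # Law of cosines gives the cosine of the elbow bend.
    cos_elbow = (d2 - upper ** 2 - fore ** 2) / (2 * upper * fore)
    if abs(cos_elbow) > 1.0 + 1e-9:
        raise ValueError("target out of reach")
    cos_elbow = max(-1.0, min(1.0, cos_elbow))  # absorb rounding error
    elbow = math.acos(cos_elbow)
    # Shoulder angle: direction to the target minus the offset caused by the bent elbow.
    shoulder = math.atan2(y, x) - math.atan2(
        fore * math.sin(elbow), upper + fore * math.cos(elbow)
    )
    return shoulder, elbow

def forward(shoulder, elbow, upper=0.30, fore=0.25):
    """Forward kinematics, used here only to check the IK result."""
    wx = upper * math.cos(shoulder) + fore * math.cos(shoulder + elbow)
    wy = upper * math.sin(shoulder) + fore * math.sin(shoulder + elbow)
    return wx, wy
```

Feeding the angles returned by `two_link_ik` back into `forward` reproduces the requested hand position, which is the property a full 3D solver for the avatar's arm would also need.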

32 Biocontrol model  Model each joint by a second-order control system  a muscle applies a torque to the joint, resisted by a moment of inertia and damping  Generate different types of motion (fast, slow, etc.) by varying the model parameters
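A second-order joint model of this kind can be sketched as a spring-like torque pulling the joint toward a target angle, resisted by inertia and damping. The parameter values, the proportional torque law and the name `simulate_joint` are assumptions for illustration; the slide does not give the project's actual equations.

```python
def simulate_joint(target, stiffness, damping, inertia=1.0, dt=0.001, duration=2.0):
    """Integrate inertia * angle'' = stiffness * (target - angle) - damping * angle'.

    Returns the joint angle at every time step (semi-implicit Euler integration).
    All parameters are illustrative assumptions, not values from the project.
    """
    angle, velocity = 0.0, 0.0
    trajectory = []
    for _ in range(round(duration / dt)):
        torque = stiffness * (target - angle)           # "muscle" torque
        accel = (torque - damping * velocity) / inertia
        velocity += accel * dt
        angle += velocity * dt
        trajectory.append(angle)
    return trajectory

# Two critically damped settings: a fast movement and a slow one.
fast = simulate_joint(1.0, stiffness=400.0, damping=40.0)
slow = simulate_joint(1.0, stiffness=25.0, damping=10.0)
```

Both trajectories settle at the target angle, but the stiffer setting gets there sooner, which is how varying the model parameters yields different motion qualities.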

33 Ambient motion  If only hands, arms, and face are animated, the result is stiff and lifeless.  Animate the spine and head by mixing “ambient motion” from motion capture files with synthetic animation.
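One simple way to picture the mixing is to add a looped motion-capture offset to the spine and head channels of each synthetic frame. The frame layout (a dict of joint name to a single angle), the additive blend and the name `mix_ambient` are assumptions for this sketch; the slide does not say how the project combined the two sources.

```python
def mix_ambient(synthetic, ambient, ambient_joints=("spine", "head"), weight=1.0):
    """Add looped ambient motion to selected joints of a synthetic animation.

    synthetic, ambient: lists of frames, each a dict {joint_name: angle}.
    Joints outside `ambient_joints` (hands, arms) are left untouched.
    """
    mixed = []
    for i, frame in enumerate(synthetic):
        sway = ambient[i % len(ambient)]  # loop the captured clip
        out = dict(frame)                 # copy so the input is not modified
        for joint in ambient_joints:
            out[joint] = frame.get(joint, 0.0) + weight * sway.get(joint, 0.0)
        mixed.append(out)
    return mixed
```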

34 Usefulness of synthetic animation  An alternative route to creating animations  Every important physical feature of a sign is notated in HamNoSys, guaranteed to be reproduced in the animation  precise contacts between hands  relationship between hands and body  Any avatar can be targeted at low additional cost

35 Closing the feedback loop  So far, only the native signers involved in the project can judge the output of our HPSG generation system  Requires intimate knowledge of HamNoSys at least  With the animation output, we gain access to the intuitions of many more native signers than today  Opens the way to more formal evaluation of the generation system than is available to date

36 Summary: Language Technology  First successful steps in HPSG language modelling and translation of English to sign language  Encoding established and extended sign language notation with standard description model (XML)  Already close to closing the feedback loop to allow native signers' evaluation of our language production system

37 Presentation by Streams  Animation and Linguistics  User Applications : Evaluation of broadcast transmission for DTV  Exploitation and Dissemination

38 User Applications Objectives  WP1 Television  Closed signing for Broadcast DTT  Enhanced signing experience  Regulation and Standards  WP2 Internet  Information and Education for Deaf People  WP3 Face to Face  High Street Post Office Counter Services  Science Museum Trial - Summer 2001

39 Presentation by Streams - Television  WP1 Television  Closed signing for Broadcast DTT  Enhanced signing experience  Regulation and Standards  WP2 Internet  Information and Education for Deaf People  WP3 Face to Face  High Street Post Office Counter Services  Science Museum Trial - Summer 2001

40 VH on TV: The Advantages  Low transmission rate < 25 kbit/s  Compatibility with signing on other media and foreign deaf languages  Precise, sharp representation of signer  Open display options  Compliance with international standards: MPEG, DVB  Future-proof:  cost saving  allows vast no. of signed programmes  no transition from video-based to VH signing

41 Broadcast VH Signing: Achievements  Integrated TX system for broadcast to STBs  demonstrator complete end of 2000  Implementing virtual human s/w in STB  Incorporating a compression layer  Using MPEG-2 delivery layer for maximum compliance:  with existing hardware  with MPEG & DVB standards  with proprietary formats

42 Broadcast VH Signing: Functional architecture [Diagram blocks: Encoder (MPEG-2 AV encoder, MPEG-4 SNHC encoder, BAF encoder); System (MUX, Packet); Delivery (MPEG-2 TS); System (dePacket, deMUX); Decoder (MPEG-2 AV decoder, MPEG-4 SNHC decoder, BAF decoder); Compositor (MPEG-4 SNHC player, BAF player, COMPOSE). Legend: normative, proprietary]

43 Broadcast VH Signing: System layer implementation [Diagram blocks: Encoder, System, Delivery (MPEG-2 TS), System, Decoder, Compositor. Components: UDP/TCP packetiser, Thomson MPEG encoder, RF modulator, DVB receiver card, IP filter]

44 Broadcast VH Signing: Versatile delivery architecture [Diagram labels: Coding layer: MPEG-2 (Audio, Video), MPEG-4 (Audio, Video, SNHC, Scene desc., FlexMUX), Proprietary (BAF), SiGML, Text, MPEG-7 content description. Delivery layer (DVB compliant): MPEG-2 Packetized Elementary Stream (PES), Section, MPEG-2 Transport Stream (TS)]

45 Broadcast VH Signing: Perspectives  Advanced TX system for broadcast to STBs  Open, MPEG & DVB compliant architecture  Improved synchronisation layer  Integrating a compositing layer  Implementing a complete MPEG-4 multimedia player  Integrating SiGML stream

46 Broadcast VH Signing: Targeted architecture [Diagram blocks: Encoder (MPEG-2/4 AV encoder, MPEG-4 SNHC encoder, BAF encoder); System (MUX, Packet); Delivery (MPEG-2 TS); System (dePacket, deMUX); Decoder (MPEG-2/4 AV decoder, MPEG-4 SNHC decoder, BAF decoder); Compositor (MPEG compositor, Multimedia player). Legend: normative, proprietary]

47 Presentation by Streams - WWW  WP1 Television  Closed signing for Broadcast DTT  Enhanced signing experience  Regulation and Standards  WP2 Internet  Information and Education for Deaf People  WP3 Face to Face  High Street Post Office Counter Services  Science Museum Trial - Summer 2001

48 Weather Forecast Application  First WWW application: daily weather forecast in 3 sign languages  content creation  example forecast  evaluation

49 Creation of content  Source: forecast in free text  Tool for semi-automatic conversion  manual standardisation of text  automatic generation of sign languages  Result: 3 webpages  English/BSL & Dutch/SLN & German/DGS

50 Demo

51 Evaluation with Deaf users  Subjective quality of signing rated as ‘reasonable’ or ‘good’  68% correct or partially correct  Improvement possibilities  mouthing  facial expressions

52 Mouthing  Scores for signs depending to various degrees on mouthing

53 Facial Expressions  Scores for signs depending to various degrees on facial expressions

54 Next Steps  Improvements  Beta-testing  on line  larger user group  user feedback  Exploitation planning

55 Presentation by Streams – Face to Face  WP1 Television  Closed signing for Broadcast DTT  Enhanced signing experience  Regulation and Standards  WP2 Internet  Information and Education for Deaf People  WP3 Face to Face  High Street Post Office Counter Services  Science Museum Trial - Summer 2001

56 WP3: Face-to-face transactions  Research concentrated on TESSA (Text and Sign Support Agent)  Enables Post Office counter clerks to “translate” from (English) speech to sign language  System developments:  Autumn 2000: New system software completed, incorporating IBM “ViaVoice” speech recognition and improved avatar  Spring 2001:  200 new signs recorded, processed and added to system  Spring/Summer 2001: Development and testing of “unconstrained system”

57 First System using Constrained Speech Recognition

58 “Unconstrained” Speech System

59 Demo

60 Testing the Speech Recognition Accuracy of the Unconstrained System  Single speaker  200 “constrained” phrases  Three recording conditions:  studio microphone in acoustic booth  boom microphone I in lab  boom microphone II in Science Museum Post Office  Three conditions for recogniser:  Untrained  Acoustic models fully trained on boom microphone II in lab  Acoustic and language models fully trained

61 Speech recognition accuracy of unconstrained system

62 Language Processing I  Co-occurrence matrix of Words versus Phrases [Matrix figure: rows are vocabulary words (a, about, access, account, ..., you, you’ve, your); columns are Phrases 1, 2, 3, ..., Phrase n; each entry is a small count such as 0, 1 or 2]
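A words-versus-phrases co-occurrence matrix of this shape can be built by counting how often each vocabulary word occurs in each phrase. The example phrases, the whitespace tokenisation and the name `cooccurrence` below are assumptions for illustration, not the TESSA phrase set or code.

```python
def cooccurrence(phrases):
    """Return (vocab, matrix) where matrix[i][j] counts word i in phrase j."""
    tokenised = [p.lower().split() for p in phrases]
    vocab = sorted({w for words in tokenised for w in words})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(phrases) for _ in vocab]
    for j, words in enumerate(tokenised):
        for w in words:
            matrix[index[w]][j] += 1
    return vocab, matrix

# Hypothetical counter-clerk phrases, invented for this example.
phrases = ["a first class stamp", "is it a parcel or a letter"]
vocab, matrix = cooccurrence(phrases)
```

Here the row for "a" holds the counts 1 and 2, matching the kind of small integer entries visible on the slide.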

63 Language Processing II  Entry W(i,j) in matrix is transformed to: [equation not preserved in transcript; its two annotated factors are “Compresses value of entry” and “Normalised average uncertainty about phrase p_j given word w_i”]  Given M words output by recogniser, score for each phrase is computed as: [equation not preserved in transcript]  Scores above a threshold T are displayed to PO clerk in a list
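The slide's formulas did not survive extraction, so the following is a sketch of one standard weighting that fits the two annotations, not a reconstruction of the authors' equations. It assumes log-entropy weighting: a logarithm compresses each count, and each word is scaled by one minus its normalised entropy over phrases. The averaging over the M recognised words and the names `log_entropy_weights` and `score_phrases` are likewise assumptions.

```python
import math

def log_entropy_weights(matrix):
    """Transform counts W(i,j) to (1 - normalised entropy of word i) * log(1 + W(i,j))."""
    n_phrases = len(matrix[0])
    weighted = []
    for row in matrix:
        total = sum(row)
        entropy = 0.0
        for count in row:
            if count:
                p = count / total
                entropy -= p * math.log(p)
        # Normalise by the maximum possible entropy, log(number of phrases).
        norm = entropy / math.log(n_phrases) if n_phrases > 1 else 0.0
        weighted.append([(1.0 - norm) * math.log(1 + c) for c in row])
    return weighted

def score_phrases(words, vocab, weighted, threshold):
    """Average the weights of the recognised words per phrase; keep scores above threshold."""
    index = {w: i for i, w in enumerate(vocab)}
    n_phrases = len(weighted[0])
    scores = [0.0] * n_phrases
    for w in words:
        i = index.get(w)
        if i is not None:  # ignore words outside the vocabulary
            for j in range(n_phrases):
                scores[j] += weighted[i][j]
    m = max(len(words), 1)
    return [(j, s / m) for j, s in enumerate(scores) if s / m > threshold]
```

Under this weighting a word that appears equally in every phrase contributes nothing, while a word confined to one phrase pushes that phrase up the list shown to the clerk.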

64 Testing the Phrase Retrieval Accuracy of the Unconstrained System  10 speakers  For each speaker and each of 200 phrases:  record one utterance of the “constrained” phrase  ask speaker to write down another way of expressing the phrase  record speaker saying this phrase  Training of recogniser not possible for 10 different speakers  Hence measure phrase retrieval accuracy on text of unconstrained phrases only

65 Phrase recognition Results on Text of Alternative Utterances Average accuracy = 73.3%

66 Future Work  Unconstrained System  Investigate use of partial string matching of word sequences and phoneme sequences  Investigate use of Latent Semantic Analysis  Add spoken language(s) translation  Sign recognition  Collect data  Configure baseline system

67 Exploitation and Dissemination Highlights  Exploitation and Dissemination  BBC Collaboration for closed signing solution for broadcasting DTV  TESSA BCS IT Award & Gold Medal  WWW Weather Forecasting in 3 European Sign Languages  Close Involvement of Deaf People

68 Dissemination Highlights  November 2000: TESSA wins British Computer Society Gold Medal for IT  February 2001: TESSA exhibited at Royal Society  March 2001: TESSA appears on “Computer Club” (German TV)  July–September 2001: TESSA on exhibition at Science Museum, London  October 8th 2001: TESSA appears on “Blue Peter” (BBC TV)  November 2001: TESSA on show at COMDEX, Las Vegas

69 Exploitation Highlights / Short Term  Bandwidth efficient closed signing  Excessive in-vision signing disliked by hearing people  Impacts on DTT multiplexes where bit-rate is already at a premium  BBC investigation of closed signing for DTV  Demonstration of Avatar-based signing  Body suit capture technologies

70 Short Term- WWW strategy  Give away basic web browser  Sell SiGML authoring tool presented  De facto standard

71 Exploitation Highlights Medium to Long Term  Conversion of subtitles  high % of programmes subtitled  supports wide range of deaf signing languages  subtitles translated in set top box  overcomes spectrum capacity & scheduling restrictions  Requirements:  reliable unconstrained translator  next generation DVB-compliant STB with in-built signing decoder

