ViSiCAST 2001 Technical Audit 8 October 2001, Brussels Michele Wakefield - Project Manager, ITC.

The ViSiCAST Project
Virtual Signing: Capture, Animation, Storage and Transmission

Aims of ViSiCAST Project
“…support improved access by deaf citizens to information and services in sign language”
• user-friendly methods to capture & generate signs
• machine-readable system to describe gestures
... preferred medium is sign language

Tessa at the Post Office
• Using speech recognizer
• Convert counter-clerk’s voice input to text
• Generate sign stream from text
• BSL – limited repertoire

ViSiCAST Consortium
• Independent Television Commission
• Televirtual
• University of East Anglia
• The Post Office
• Royal National Institute for Deaf People
• Instituut voor Doven
• Hamburg University
• Institut für Rundfunktechnik
• Institut National des Télécommunications

Project Dimensions
• Duration
  - Start: January 2000
  - Finish: December 2002
  - 36 months
• Total Costs
  - 3770 kECU total
  - 2876 kECU funding from EC

ViSiCAST Project Highlights
• Prototype enabling text translation and direct synthesis of sign language gestures
• Quality assessment support to other EU project
• New TESSA system trial at Science Museum, London
• Achieved BCS IT Award and Gold Medal
• Innovative transmission assessment for broadcast TV
• BBC seek to deliver a closed signing service for broadcast DTV
• WWW Weather-forecaster with Virtual Signer available in 3 Sign Languages

ViSiCAST Project Structure
[Diagram. Labels: Technology (Animation, Linguistics); User Application (Broadcast, WWW, High Street); Evaluation; Exploitation & Dissemination]

Presentations by Core Streams
• Technology: Animation & Linguistics
  - WP4 Animation: Mark, Televirtual (10)
  - WP5 Linguistics: Thomas, UH (10)
• User: Applications
  - WP1 Broadcast: Werner, IRT; Francoise, INT (10)
  - WP2 WWW: Corrie/Margriet, IvD (10)
  - WP3 Face to Face: Stephen, UEA (10)
• Exploitation & Dissemination
  - WP7, 8: Michele (10)

Technology Focus Objectives
• WP4 Animation
  - Increased realism in sign generation
  - Enhanced signing experience
• WP5 Sign Language Linguistics
  - Use of natural sign language
  - Synthesis of sign language gestures

Animation: Initial Work
• Developed TESSA & VISIA Avatars
• Developed Capture / Animation system
• Integrated into early demos of WPs 1-2-3

Animation Work: Objectives
• WP4:
  - Develop Hi-Resolution Avatars + related capture, animation and transmission formats inc. compression
  - To enable and support application development in WPs using WP4 (& WP5) Product
  - To further develop, compare and integrate both proprietary and standard solutions, where appropriate

Animation: Current Work
• Through Year Two
  - Continued to support Application development
  - Continuous upgrade to VISIA / TESSA player (OpenGL renderer under ActiveX control)
  - Bug fixing / Motion capture support
  - .baf format and compression layer with WP1 to create Broadcast Demonstrator using ViSiCAST system
  - MPEG compatibility / parallel development in WP4 and applications

Animation: Continuing & Future Work
• Working on ways to improve facial animation / realism (forehead / eyes)
• Exploring Statistical Methods to define and generate facial Animation
• Working on ways to facilitate Avatar creation (Photographic acquisition)
• Mask 2 + Improved Mo Cap

MPEG-4 compliant Animation: Achievements
• MPEG-4 SNHC for interoperable animation
• MPEG-4 SNHC player and server delivered in June 2001
• Making use of an MPEG-4 compliant Visia model
• Compliance with VRML standard (H-Anim specifications)
• Incorporating a full compression layer (→ 5 to 25 kbit/s, → 7 to 14 bit/vertex)
  - 3D mesh & texture encoding
  - Motion parameters (BAP/FAP) encoding
• Implementing importation and editing tools
• Open delivery interface: MPEG-2, IP, ATM...

MPEG-4 compliant Animation: Perspectives
• Advanced interoperable distributed animation system
• Improved facial animation
• MPEG-4 System layer implementation
• Multimedia (audio, video, text…) synchronisation
• Error resilience
• Management of scene description
• MPEG-compliant SiGML-driven animation
• Open input/output interface

Presentation by Streams - Linguistics
• WP4 Animation
  - Increased realism in sign generation
  - Enhanced signing experience
• WP5 Sign Language Linguistics
  - Use of natural sign language
  - Synthesis of sign language gestures

WP5: Language Technology
• Goal within the project:
  - To provide semi-automatic translation from English into BSL, DGS, NGT
  - Can also be used to assist the user in monolingual language input
• No writing system for sign languages established

The last year: 3 deliverables
• D5-1: Defining the interfaces
• D5-2: Transfer to XML:
  - SiGML definition
• D5-3: Prototype translation system:
  - English to notation

D5-1: Defining the interfaces
• Adaptation of Discourse Representation Structure
• Extension of HamNoSys, a phonetic transcription system for sign language
• Notation conventions for all non-manual aspects relevant for (European) sign languages
  - Body movement
  - Head movement
  - Facial expressions
  - Mouthing and Mouth gestures
  - Eye movement
  - Synchronicity with manual elements

D5-2: SiGML
• Defines XML domain based on D5-1 manual and non-manual notation
• Simple timing model
  - Probably to be revised to ease integration with upcoming synchronisation models as required for broadcasting etc.
  - SMIL, XMT (MPEG-4) etc.

D5-3: Proto text-to-sign notation
• English to semantics (DRS)
  - CMU Parser
  - DRS construction
• Semantics to sign language notation
  - DRS to HPSG semantics (ALE/MRS)
  - HPSG generation (ALE/LinGo)
  - HPSG PHON (HamNoSys) to SiGML

HPSG modelling of sign languages
• Aiming at proper sign language, not anything like SEE
• No detailed grammars published, no usable dictionaries
• Most importantly: Data-driven
  - Lexicon and every aspect of our grammar fragment

Example: Verifying details

Demo: D5-3 plus D4-2
• Due month 26 (Feb 02), i.e. work in progress
• Complete route from English to sign language animation

Synthetic Animation of SiGML
• Convert avatar-independent SiGML to avatar-specific description:
  - Define all SiGML locations (shoulder, eyes, fingertip, etc.) in terms of the avatar's geometry
  - Define hand shapes in terms of rotations of the hand joints
  - Determine arm joint rotations from hand positions by inverse kinematics
• Convert SiGML movements into numerically defined trajectories
• Output in BAF format or VRML
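
The inverse-kinematics step on this slide (arm joint rotations from hand positions) can be illustrated with a minimal sketch. It is not the ViSiCAST implementation: it assumes a planar arm with only a shoulder and an elbow, and the function names and segment lengths are hypothetical.

```python
import math

def two_link_ik(x, y, upper=0.30, fore=0.28):
    """Return (shoulder, elbow) angles in radians that place the wrist at (x, y).

    Assumption: a planar two-segment arm with the shoulder at the origin.
    `upper` and `fore` are illustrative segment lengths in metres.
    """
    d2 = x * x + y * y
    d = math.sqrt(d2)
    if d > upper + fore or d < abs(upper - fore):
        raise ValueError("target out of reach")
    # Law of cosines gives the elbow bend; clamp against rounding error.
    cos_elbow = (d2 - upper ** 2 - fore ** 2) / (2 * upper * fore)
    elbow = math.acos(max(-1.0, min(1.0, cos_elbow)))
    # Shoulder angle: direction to the target minus the offset caused by the bend.
    shoulder = math.atan2(y, x) - math.atan2(
        fore * math.sin(elbow), upper + fore * math.cos(elbow)
    )
    return shoulder, elbow

def forward(shoulder, elbow, upper=0.30, fore=0.28):
    """Forward kinematics, used here only to check the IK solution."""
    wx = upper * math.cos(shoulder) + fore * math.cos(shoulder + elbow)
    wy = upper * math.sin(shoulder) + fore * math.sin(shoulder + elbow)
    return wx, wy

if __name__ == "__main__":
    s, e = two_link_ik(0.4, 0.2)
    print(forward(s, e))  # should reproduce the requested wrist position
```

A real avatar arm has more degrees of freedom than this, so a full solver would need extra constraints (for example on elbow elevation) to pick one of many valid poses.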

Biocontrol model
• Model each joint by a second-order control system
  - a muscle applies a torque to the joint, resisted by a moment of inertia and damping
• Generate different types of motion (fast, slow, etc.) by varying the model parameters
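
As a rough illustration of a second-order joint model, the sketch below integrates inertia * angle'' = torque - damping * angle'. The slide does not say how the muscle torque is computed; the spring-like form stiffness * (target - angle) is an assumption, as are the function name, the parameter values and the integration scheme.

```python
def simulate_joint(target, inertia=1.0, stiffness=100.0, damping=20.0,
                   dt=0.001, duration=1.0):
    """Return the joint angle at each time step while it moves toward `target`.

    Assumption: the muscle torque is proportional to the remaining error.
    Integration is semi-implicit Euler (velocity first, then angle).
    """
    angle, velocity = 0.0, 0.0
    trajectory = []
    for _ in range(int(round(duration / dt))):
        torque = stiffness * (target - angle) - damping * velocity
        velocity += (torque / inertia) * dt
        angle += velocity * dt
        trajectory.append(angle)
    return trajectory

if __name__ == "__main__":
    # Varying only the damping changes the character of the motion.
    for name, c in [("springy", 4.0), ("brisk", 20.0), ("sluggish", 60.0)]:
        traj = simulate_joint(1.0, damping=c)
        print(name, "peak:", round(max(traj), 3), "final:", round(traj[-1], 3))
```

With these illustrative values, damping = 20 is the critically damped case (2 * sqrt(stiffness * inertia)); lower damping overshoots the target and higher damping approaches it more slowly.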

Ambient motion
• If only hands, arms, and face are animated, the result is stiff and lifeless.
• Animate the spine and head by mixing “ambient motion” from motion capture files with synthetic animation.

Usefulness of synthetic animation
• An alternative route to creating animations
• Every important physical feature of a sign is notated in HamNoSys, guaranteed to be reproduced in the animation
  - precise contacts between hands
  - relationship between hands and body
• Any avatar can be targeted at low additional cost

Closing the feedback loop
• So far, only the native signers involved in the project can judge the output of our HPSG generation system
  - Requires intimate knowledge of HamNoSys at least
• With the animation output, we have access to the native signers’ intuition of many more people than today
  - Opens the way to more formal evaluation of the generation system than is available to date

Summary: Language Technology
• First successful steps in HPSG language modelling and translation of English to sign language
• Encoding established and extended sign language notation with standard description model (XML)
• Already close to closing the feedback loop to allow native signers’ evaluation of our language production system

Presentation by Streams
• Animation and Linguistics
• User Applications: Evaluation of broadcast transmission for DTV
• Exploitation and Dissemination

User Applications Objectives
• WP1 Television
  - Closed signing for Broadcast DTT
  - Enhanced signing experience
  - Regulation and Standards
• WP2 Internet
  - Information and Education for Deaf People
• WP3 Face to Face
  - High Street Post Office Counter Services
  - Science Museum Trial - Summer 2001

Presentation by Streams - Television
• WP1 Television
  - Closed signing for Broadcast DTT
  - Enhanced signing experience
  - Regulation and Standards
• WP2 Internet
  - Information and Education for Deaf People
• WP3 Face to Face
  - High Street Post Office Counter Services
  - Science Museum Trial - Summer 2001

VH on TV: The Advantages
• Low transmission rate < 25 kbit/s
• Compatibility with signing on other media and foreign deaf languages
• Precise, sharp representation of signer
• Open display options
• Compliance with international standards: MPEG, DVB
• Future-proof:
  - cost saving
  - allows vast no. of signed programmes
  - no transition from video-based to VH signing

Broadcast VH Signing: Achievements
• Integrated TX system for broadcast to STBs
  - demonstrator complete end of 2000
• Implementing virtual human s/w in STB
• Incorporating a compression layer
• Using MPEG-2 delivery layer for maximum compliance:
  - with existing hardware
  - with MPEG & DVB standards
  - with proprietary formats

Broadcast VH Signing: Functional architecture
[Diagram. Labels: MUX; Packet; MPEG-2 AV encoder; MPEG-4 SNHC encoder; BAF encoder; MPEG-2 AV decoder; MPEG-4 SNHC decoder; BAF decoder; MPEG-4 SNHC player; BAF player; COMPOSE; dePacket; deMUX; MPEG-2 TS. Layers: Encoder, System, Delivery, System, Decoder, Compositor. Legend: normative, proprietary]

Broadcast VH Signing: System layer implementation
[Diagram. Labels: UDP/TCP packetiser; Thomson MPEG encoder; RF modulator; DVB receiver card; IP filter; MPEG-2 TS. Layers: Encoder, System, Delivery, System, Decoder, Compositor]

Broadcast VH Signing: Versatile delivery architecture
[Diagram. Labels: MPEG-2 Transport Stream (TS); MPEG-2 Packetized Elementary Stream (PES); Section; PES; FlexMUX; BAF; Audio; Video; Scene desc.; SNHC; MPEG-4; MPEG-2; Proprietary; SiGML; Text; MPEG-7; Content description; Coding; Delivery; DVB compliant]

Broadcast VH Signing: Perspectives
• Advanced TX system for broadcast to STBs
  - Open, MPEG & DVB compliant architecture
  - Improved synchronisation layer
  - Integrating a compositing layer
• Implementing a complete MPEG-4 multimedia player
• Integrating SiGML stream

Broadcast VH Signing: Targeted architecture
[Diagram. Labels: MPEG Compositor; MUX; Packet; MPEG-2/MPEG-4 AV encoder; MPEG-4 SNHC encoder; BAF encoder; MPEG-2/MPEG-4 AV decoder; MPEG-4 SNHC decoder; BAF decoder; Multimedia player; dePacket; deMUX; MPEG-2 TS. Layers: Encoder, System, Delivery, System, Decoder, Compositor. Legend: normative, proprietary]

Presentation by Streams - WWW
• WP1 Television
  - Closed signing for Broadcast DTT
  - Enhanced signing experience
  - Regulation and Standards
• WP2 Internet
  - Information and Education for Deaf People
• WP3 Face to Face
  - High Street Post Office Counter Services
  - Science Museum Trial - Summer 2001

Weather Forecast Application
• First WWW application: daily weather forecast in 3 sign languages
  - content creation
  - example forecast
  - evaluation

Creation of content
• Source: forecast in free text
• Tool for semi-automatic conversion
  - manual standardisation of text
  - automatic generation of sign languages
• Result: 3 webpages
  - English/BSL & Dutch/SLN & German/DGS

Demo

Evaluation with Deaf users
• Subjective quality of signing rated as ‘reasonable’ or ‘good’
• 68% correct or partially correct
• Improvement possibilities
  - mouthing
  - facial expressions

Mouthing
[Chart: scores for signs depending to various degrees on mouthing]

Facial Expressions
[Chart: scores for signs depending to various degrees on facial expressions]

Next Steps
• Improvements
• Beta-testing
  - on line
  - larger user group
  - user feedback
• Exploitation planning

Presentation by Streams – Face to Face
• WP1 Television
  - Closed signing for Broadcast DTT
  - Enhanced signing experience
  - Regulation and Standards
• WP2 Internet
  - Information and Education for Deaf People
• WP3 Face to Face
  - High Street Post Office Counter Services
  - Science Museum Trial - Summer 2001

WP3: Face-to-face transactions
• Research concentrated on TESSA (Text and Sign Support Agent)
  - Enables Post Office counter clerks to “translate” from (English) speech to sign language
• System developments:
  - Autumn 2000: New system software completed, incorporating IBM “ViaVoice” speech recognition and improved avatar
  - Spring 2001: 200 new signs recorded, processed and added to system
  - Spring/Summer 2001: Development and testing of “unconstrained system”

First System using Constrained Speech Recognition

“Unconstrained” Speech System

Demo

Testing the Speech Recognition Accuracy of the Unconstrained System
• Single speaker
• 200 “constrained” phrases
• Three recording conditions:
  - studio microphone in acoustic booth
  - boom microphone I in lab
  - boom microphone II in Science Museum Post Office
• Three conditions for recogniser:
  - Untrained
  - Acoustic models fully trained on boom microphone II in lab
  - Acoustic and language models fully trained

Speech recognition accuracy of unconstrained system

Language Processing I
Co-occurrence matrix of Words versus Phrases
[Figure: rows are words (a, about, access, account, .., you, you’ve, your); columns are Phrases 1, 2 & 3 ... Phrase n]

Language Processing II
• Entry W(i,j) in matrix is transformed to: [equation not captured in the transcript]
  - Annotation: normalised average uncertainty about phrase p_j given word w_i
  - Annotation: compresses value of entry
• Given M words output by recogniser, score for each phrase is computed as: [equation not captured in the transcript]
Scores above a threshold T are displayed to PO clerk in a list
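
The transcript does not preserve the formulas from this slide, so the following is only a sketch of one scheme that fits the surviving annotations: a log-entropy weighting, in which log(1 + W(i,j)) compresses the entry and one minus the word's normalised entropy across phrases discounts uninformative words. Both that choice and the averaging of weights over the M recognised words are assumptions, and every function name and sample phrase below is hypothetical.

```python
import math

def count_matrix(phrases):
    """Build vocab (word -> row index) and counts[i][j] = times word i occurs in phrase j."""
    vocab = {}
    for phrase in phrases:
        for word in phrase.split():
            vocab.setdefault(word, len(vocab))
    counts = [[0] * len(phrases) for _ in vocab]
    for j, phrase in enumerate(phrases):
        for word in phrase.split():
            counts[vocab[word]][j] += 1
    return vocab, counts

def log_entropy_weights(counts):
    """Assumed transform: (1 - normalised entropy of word i) * log(1 + W(i,j))."""
    n_phrases = len(counts[0])
    weights = []
    for row in counts:
        total = sum(row)
        entropy = -sum((c / total) * math.log(c / total) for c in row if c > 0)
        norm = entropy / math.log(n_phrases) if n_phrases > 1 else 0.0
        weights.append([(1.0 - norm) * math.log(1 + c) for c in row])
    return weights

def retrieve(recognised, vocab, weights, phrases, threshold):
    """Average the weights of the M recognised words; list phrases scoring above threshold."""
    m = max(len(recognised), 1)
    rows = [vocab[w] for w in recognised if w in vocab]
    scored = []
    for j, phrase in enumerate(phrases):
        score = sum(weights[i][j] for i in rows) / m
        if score > threshold:
            scored.append((score, phrase))
    return [phrase for _, phrase in sorted(scored, reverse=True)]

if __name__ == "__main__":
    phrases = ["a first class stamp please",
               "a second class stamp please",
               "how much is this parcel"]
    vocab, counts = count_matrix(phrases)
    weights = log_entropy_weights(counts)
    print(retrieve("second class stamp".split(), vocab, weights, phrases, 0.1))
```

Words that occur in only one phrase get the full weight, while words spread evenly over all phrases contribute nothing, which is why a short partial utterance can still rank the intended phrase first.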

Testing the Phrase Retrieval Accuracy of the Unconstrained System
• 10 speakers
• For each speaker and each of 200 phrases:
  - record one utterance of the “constrained” phrase
  - ask speaker to write down another way of expressing the phrase
  - record speaker saying this phrase
• Training of recogniser not possible for 10 different speakers
• Hence measure phrase retrieval accuracy on text of unconstrained phrases only

Phrase Recognition Results on Text of Alternative Utterances
Average accuracy = 73.3%

Future Work
• Unconstrained System
  - Investigate use of partial string matching of word sequences and phoneme sequences
  - Investigate use of Latent Semantic Analysis
• Add spoken language(s) translation
• Sign recognition
  - Collect data
  - Configure baseline system

Exploitation and Dissemination Highlights
• Exploitation and Dissemination
  - BBC Collaboration for closed signing solution for broadcasting DTV
  - TESSA BCS IT Award & Gold Medal
  - WWW Weather Forecasting in 3 European Sign Languages
  - Close Involvement of Deaf People

Dissemination Highlights
• November 2000: TESSA wins British Computer Society Gold Medal for IT
• February 2001: TESSA exhibited at Royal Society
• March 2001: TESSA appears on “Computer Club” (German TV)
• July–September 2001: TESSA on exhibition at Science Museum, London
• October 8th 2001: TESSA appears on “Blue Peter” (BBC TV)
• November 2001: TESSA on show at COMDEX, Las Vegas

Exploitation Highlights / Short Term
• Bandwidth efficient closed signing
  - Excessive in-vision signing disliked by hearing people
  - Impacts on DTT multiplexes where bit-rate is already at a premium
• BBC investigation of closed signing for DTV
  - Demonstration of Avatar-based signing
  - Body suit capture technologies

Short Term: WWW strategy
• Give away basic web browser
• Sell SiGML authoring tool presented
• De facto standard

Exploitation Highlights: Medium to Long Term
• Conversion of subtitles
  - high % of programmes subtitled
  - supports wide range of deaf signing languages
  - subtitles translated in set top box
  - overcomes spectrum capacity & scheduling restrictions
• Requirements:
  - reliable unconstrained translator
  - next generation DVB-compliant STB with in-built signing decoder