Concepts of Multimedia Processing and Transmission IT 481, Lecture #1 Dennis McCaughey, Ph.D. 22 January, 2007.


1 Concepts of Multimedia Processing and Transmission IT 481, Lecture #1 Dennis McCaughey, Ph.D. 22 January, 2007

2 Outline
Course Description
Instructor
Exams, Homework and Project
Grading
General Policies
Lecture Schedule
01/22/2007 IT 481, Spring

3 Course Description
Topics
  – The fundamentals of signal and image processing, including signal processing algorithms that have applications to multimedia
  – Techniques for voice coding and recognition, CD and DVD technology, streaming video, WANs and LANs, and videoconferencing technology
Text: Fred Halsall, Multimedia Communications: Applications, Networks, Protocols and Standards, Addison-Wesley, 1st edition (2002), ISBN 0-201-39818-4.

4 Instructor
Dennis McCaughey
  – Contact Information
    703-263-7425 (Office)
    703-624-6830 (Cell)
    dgm@rincon.com (e-mail)
    Office Hours: one hour before class
  – Background
    PhD in EE, University of Southern California, 1977
    Thesis: Degrees of Freedom for Projection Imaging

5 Exams, Homework and Project
Mid-Term: 1 hour, closed book
  – Covers the key topics from class and homework
Final: format to be determined
Homework: 1) reading assignments, 2) written answers to selected questions based on the reading assignments, 3) some limited math problems
Project (preliminary format): MATLAB implementations of selected multimedia processing applications.

6 More on the Project
The course project will explore aspects of multimedia signal processing and will be computer based, using MATLAB.
Project topics will consist of a set of MATLAB implementations addressing multimedia concepts, assigned on a running basis over the semester.
Each student will be required to submit the project in the format of a final report.
Projects will be graded on the effort applied, not on MATLAB programming skills.
Details regarding topics, content, and format will be provided during the course.

7 Grading
The final grade will be determined by a weighted average of the homework assignments, a mid-term exam, a final exam and a project:
Homework   10%
Mid-Term   20%
Project    30%
Final      40%

8 General Policies
Collaboration
  – Students are permitted and encouraged to collaborate on homework assignments.
  – All graded work, however, must be the original effort of the student submitting the paper.
Homework
  – Homework will be collected at the beginning of each class period.
  – Note: Late homework will be accepted provided the reason for the delay is coordinated with the instructor within 2 days of its assignment.
  – Homework solutions will be discussed in class.
Make-up Exams
  – Make-up exams will not be given unless a detailed written explanation, accompanied by documentation for the absence, is provided. If this information is not provided, an F grade will be given for the exam.
  – The location and time for a make-up exam will be decided by the instructor.
Students are also expected to be in class and on time for every class.

9 Lecture Schedule (Preliminary)

10 Multimedia Communications

11 What is Multimedia?
Multimedia is a combination of text, art, sound, animation, and video.
Slide: Courtesy, Hung Nguyen

12 Multimedia Components Simplified
Multimedia can be viewed as the combination of audio, video and data, together with how they interact with the user (more than the sum of the individual components).
[Figure labels: Audio, Video, Data, Multimedia]

13 Background
Fast-paced emergence of applications in medicine, education, travel, etc.
Characterized by large documents that must be communicated with short delays
Glamorous applications such as distance learning and video teleconferencing
Applications that are enhanced by video are often seen as the driver for the development of multimedia networks

14 Forces Driving Communications That Facilitate Multimedia Communications
Evolution of communications and data networks
Increasing availability of almost unlimited bandwidth on demand
Availability of ubiquitous access to the network
Ever-increasing amount of memory and computational power
Sophisticated terminals
Digitization of virtually everything

15 New Information System Paradigm
[Figure labels: Integration; Multimedia Integrated Communication; Multimedia Processing; Broadband Link; Workstation, PC]
Slide: Courtesy, Hung Nguyen

16 Elements of Multimedia Systems
Two key communication modes
  – Person-to-person
  – Person-to-machine
[Figure labels: User Interface; Transport; Processing, Storage and Retrieval]
Slide: Courtesy, Hung Nguyen

17 Multimedia Networks
The world has been wrapped in copper and glass fiber and can be viewed as a “hair ball” with physical, wireless and satellite entry/exit points.
Physical: LAN-WAN connections
Wireless: Cellular telephony, wireless PC connectivity
Satellite: Inmarsat, Thuraya, ACeS, etc.

18 Multimedia Communication Model
Partitioning of information objects into distinct types, e.g., text, audio, video
Standardization of service components per information type
Creation of platforms at two levels: network service and multimedia communication
Definition of general applications for multiple use in various multimedia environments
Definition of specific applications, e.g., e-commerce, tele-training, …, using building blocks from the platforms and general applications

19 Requirements
User Requirements
  – Fast preparation and presentation
  – Dynamic control of multimedia applications
  – Intelligent support to users
  – Standardization
Network Requirements
  – High speed and variable bit rates
  – Multiple virtual connections using the same access
  – Synchronization of different information types
  – Suitable standardized services along with support

20 Network Requirements
ATM-BISDN and SS7 have enabled the switching-based communications capabilities over the PSTN that support the necessary services
ATM-BISDN-SS7 will evolve to all-optical “switchless” networks based on packet transfer

21 Packet Transfer Concept
Allows voice, video and data to be dealt with in a common format
More flexible than circuit switching, which it can emulate while allowing the multiplexing of varied bit rate data streams
Dynamic allocation of bandwidth
Handles Variable Bit Rate (VBR) traffic directly

22 Considerations
Buffering is required for constant bit rate data such as audio
Re-sequencing and recovery capabilities must be provided over networks where packets may be received in an order different from that transmitted, or dropped
  – In an ATM network some packets can be dropped while others may not (e.g., voice vs. bank transfer data packets)
  – Optimum packet lengths for voice, video and data differ in an ATM network
  – IP packets over the Internet may arrive in a different order or be dropped
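
The re-sequencing and loss-detection requirement above can be illustrated with a minimal Python sketch. It assumes each packet carries an integer sequence number, which the slide does not specify; the function name `resequence` and the sample payloads are hypothetical.

```python
def resequence(received, first_seq, last_seq):
    """Order packets by sequence number and report which ones never arrived.

    `received` is a list of (sequence_number, payload) tuples in arrival order.
    Returns (ordered_payloads, missing_sequence_numbers). A lost packet is
    represented by None so that a decoder could conceal it.
    """
    by_seq = {}
    for seq, payload in received:
        by_seq.setdefault(seq, payload)  # keep the first copy, ignore duplicates

    ordered, missing = [], []
    for seq in range(first_seq, last_seq + 1):
        if seq in by_seq:
            ordered.append(by_seq[seq])
        else:
            ordered.append(None)
            missing.append(seq)
    return ordered, missing


# Hypothetical arrival pattern: out of order, with packets 1 and 4 dropped.
arrivals = [(2, "c"), (0, "a"), (3, "d"), (5, "f")]
ordered, missing = resequence(arrivals, 0, 5)
print(ordered)  # ['a', None, 'c', 'd', None, 'f']
print(missing)  # [1, 4]
```

A real receiver would work on a sliding window with a playout deadline rather than on a complete batch. The batch form is used here only to keep the idea visible.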

23 Digital Video Signal Transport
The following figure will be examined over the course of the semester.
[Figure labels: Video Encoder; Transformation; Quantization; Entropy Coding; Bit-Rate Control; Application; Data Structuring; Users; Network; Multiplexing/Routing; Overhead (FEC); Re-Trans; Error detection; Loss detection; Error correction; Erasure correction; Application; Re-Synch; Decoder; De-quantization; Entropy decode; Inv Trans; Loss conceal; Post process]
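
As a rough illustration of the Transformation and Quantization stages in the encoder, and of De-quantization and Inv Trans in the decoder, here is a small Python sketch. The slide does not name the transform. The one-dimensional orthonormal DCT-II, the uniform quantizer, the step size and the sample values are all assumptions chosen for illustration.

```python
import math


def _basis(k, i, n):
    """Orthonormal DCT-II basis value for coefficient k at sample i."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
    return scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))


def forward_transform(samples):
    """Encoder side: map samples to transform coefficients."""
    n = len(samples)
    return [sum(samples[i] * _basis(k, i, n) for i in range(n)) for k in range(n)]


def inverse_transform(coeffs):
    """Decoder side: map coefficients back to samples."""
    n = len(coeffs)
    return [sum(coeffs[k] * _basis(k, i, n) for k in range(n)) for i in range(n)]


def quantize(coeffs, step):
    """Uniform quantizer: the lossy step, keeping only integer levels."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Rebuild approximate coefficients from the integer levels."""
    return [q * step for q in levels]


row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical row of pixel values
step = 4.0                               # hypothetical quantizer step size
levels = quantize(forward_transform(row), step)
decoded = inverse_transform(dequantize(levels, step))
max_error = max(abs(a - b) for a, b in zip(row, decoded))
print(levels)
print(round(max_error, 3))
```

The integer levels are what an entropy coder would compress. A larger step gives fewer distinct levels and a lower bit rate at the cost of a larger reconstruction error, which is the trade-off a bit-rate control stage adjusts.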

24 Quality of Service (QoS)
The set of parameters that defines the properties of media streams
Four QoS layers can be defined:
1. User QoS: Perception of the multimedia data at the user interface (“qualitative”)
2. Application QoS: Parameters such as end-to-end delay (“quantitative”)
3. System QoS: Requirements on the communications services derived from the application QoS
4. Network QoS: Parameters such as network load and performance

25 Applications of Multimedia
Business - Business applications for multimedia include presentations, training, marketing, advertising, product demos, databases, catalogues, instant messaging, and networked communication.
Schools - Educational software can be developed to enrich the learning process.
Slide: Courtesy, Hung Nguyen

26 Applications of Multimedia
Home - Most multimedia projects reach the homes via television sets or monitors with built-in user inputs.
Public places - Multimedia will become available at stand-alone terminals or kiosks to provide information and help.
Slide: Courtesy, Hung Nguyen

27 Compact Disc Read-Only (CD-ROM)
CD-ROM is the most cost-effective distribution medium for multimedia projects.
It can contain up to 80 minutes of full-screen video or sound.
CD burners are used for reading discs and converting the discs to audio, video, and data formats.
Slide: Courtesy, Hung Nguyen

28 Digital Versatile Disc (DVD)
Multilayered DVD technology increases the capacity of current optical technology to 18 GB.
DVD authoring and integration software is used to create interactive front-end menus for films and games.
DVD burners are used for reading discs and converting the disc to audio, video, and data formats.
Slide: Courtesy, Hung Nguyen

29 Multimedia Communications
Multimedia communications is the delivery of multimedia to the user by electronic or digitally manipulated means.
[Figure: Multimedia Communications, made up of]
  – Audio Communications (telephony, sound, broadcast)
  – Video Communications (video telephony, TV/HDTV)
  – Data, text, image Communications (data transfer, fax…)
Slide: Courtesy, Hung Nguyen

30 Multimedia Terms

31 Alternative Types of Media used in Multimedia Applications

32 Multimedia Communications Networks

33 Multimedia Networks and Their Services

34 Multimedia Networks and Their Services

35 Audio-Visual Integration

36 Application in Biometrics – Bimodal Person Verification
Existing methods for person verification are mainly based on a single modality, which limits their security and robustness
Audio-visual integration using a camera and microphone makes person verification more reliable
Slide: Courtesy, Hung Nguyen

37 Joint Audio-Video Coding
Correlation between audio and video can be used to achieve more efficient coding
  – Predictive coding: audio and video information is used to construct an estimate of the current frame (cross-modal redundancy)
  – The difference between the original and estimated signal can be transmitted as parameters
  – The decision on what and how to send is based on Rate-Distortion (R-D) criteria
Reconstruction is done at the receiver according to agreed-upon decoding rules
Slide: Courtesy, Hung Nguyen

38 Cross-Modal Predictive Coding
[Figure labels: Visual Analysis; A-to-V Mapping; Decision Module (R-D); Parameter X; Nothing]
Slide: Courtesy, Hung Nguyen
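
One common way to express a rate-distortion decision is to minimize a Lagrangian cost J = D + λR over the available options. The slides do not say which criterion the decision module uses, so the Python sketch below is only an assumption-laden illustration. The candidate names, rates, distortions and the function `choose_mode` are hypothetical.

```python
def choose_mode(candidates, lam):
    """Return the candidate with the smallest cost J = distortion + lam * rate."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate_bits"])


# Hypothetical options for one frame: send nothing and rely on the
# audio-driven prediction, or send correction parameters at two precisions.
candidates = [
    {"name": "nothing", "rate_bits": 0, "distortion": 40.0},
    {"name": "coarse parameters", "rate_bits": 16, "distortion": 12.0},
    {"name": "fine parameters", "rate_bits": 64, "distortion": 4.0},
]

for lam in (0.1, 1.0, 5.0):
    print(lam, choose_mode(candidates, lam)["name"])
```

A small λ favours low distortion, so the encoder spends bits on fine parameters. A large λ makes bits expensive, so sending nothing and trusting the prediction becomes the cheapest choice.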

39 Importance of Interaction
Multimedia is more than the combination of text, audio, video and data
Interaction among media is important
Consider a poorly dubbed movie
  – Audio not synchronized with video
  – Lip movements inconsistent with language
  – Audio dynamic range inconsistent with the scene
Slide: Courtesy, Hung Nguyen

40 Media Interaction Process and Model
[Figure labels: Audio, Text, Image, Video, Multimedia, Lip synch, Face Animation, Joint A/V Coding, Compression, Synthesis, 3D Sound, Sign language, Lip reading, Speech Recognition, Text-to-Speech, Compression, Graphics, Database indexing/retrieval, Translation, Natural language]
Slide: Courtesy, Hung Nguyen

41 Bimodality of Human Speech
Human speech is produced by vibration of the vocal cords and by configuration of the vocal tract, using muscles that also generate facial expressions
Audio + Visual → Perceived
  ba + ga → da
  pa + ga → ta
  ma + ga → na
Slide: Courtesy, Hung Nguyen

42 Basic Definitions
The basic unit of acoustic speech is called a phoneme
In the visual domain, the basic unit of mouth movement is called a viseme
  – A viseme is the smallest visibly distinguishable unit of speech
  – Several phonemes can look alike on the lips and thus form one viseme group
  – The mapping between phonemes and visemes is many-to-one
Slide: Courtesy, Hung Nguyen
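
The many-to-one mapping can be pictured as a lookup table. The Python sketch below uses a tiny, illustrative grouping based on place of articulation. The group names and the choice of phonemes are assumptions, and real viseme inventories vary between studies.

```python
# Illustrative grouping only: several phonemes share one visible mouth shape.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
}


def visemes_for(phonemes):
    """Map a phoneme sequence to the viseme sequence a viewer would see."""
    return [PHONEME_TO_VISEME[p] for p in phonemes]


def viseme_groups(mapping):
    """Invert the table: list the phonemes that fall into each viseme group."""
    groups = {}
    for phoneme, viseme in mapping.items():
        groups.setdefault(viseme, []).append(phoneme)
    return groups


print(visemes_for(["p", "b", "m"]))   # three different phonemes, one viseme
print(viseme_groups(PHONEME_TO_VISEME))
```

Because the table cannot be inverted uniquely, visual information alone leaves some phonemes ambiguous, which is why lip reading is usually combined with other cues.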

43 Lip Reading System
Application to support hearing-impaired persons
People learn to understand spoken language by combining visual content with lexical, syntactic, semantic and pragmatic information
Automated lip reading systems
  – Speech recognition is possible using only visual information
  – Can be integrated with speech recognition systems to improve accuracy
Slide: Courtesy, Hung Nguyen

44 Lip Synchronization
Applications
  – In VTC (video teleconferencing), where video frames are dropped (low bandwidth requirement) but audio must still be continuous
  – In non-real-time use such as studio dubbing, where the recorded voice is full of background noise
Time-warping is commonly used in both audio and video modes
  – Time-frequency analysis
  – Video time-warping could be used for VTC
  – Audio time-warping could be used for dubbing
Slide: Courtesy, Hung Nguyen
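
The slide does not say which time-warping method is meant. Dynamic time warping (DTW) is one widely used form, and the Python sketch below shows its basic cost recursion on two short one-dimensional feature tracks. The function name and the sample values are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]. Each step
    may advance in a, in b, or in both, which lets one sequence stretch or
    compress in time relative to the other.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# The second track holds the value 2 for one extra frame. Warping absorbs it.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Tracing back through the cost table gives the alignment path itself, which is what a lip-sync system would use to decide which frames or audio segments to repeat or skip.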

45 Lip Tracking
Aims to prevent too much jerkiness in the motion rendering and too much loss in lip synchronization
Involves real-time analysis of the video signal over its spatial dimensions plus one temporal dimension
Produces meaningful parameters
  – Classification of mouth images into visemes
  – Measures of dimension, e.g., mouth widths and heights
Analysis tools: Fourier Transform, Karhunen-Loeve Transform (KLT), Probability Density Function (pdf) estimation
Slide: Courtesy, Hung Nguyen

46 Audio-to-Visual Mapping for Lip Tracking
Conversion of acoustic speech to mouth shape parameters
A mapping of phonemes to visemes
Could be most precisely implemented with a complete speech recognizer followed by a look-up table
  – High computational overhead plus table look-up complexity
  – The spoken word does not need to be recognized to achieve audio-to-visual mapping
Physical relationships exist between vocal tract shape and the sound produced → functional relationships exist between speech and visual parameters
Slide: Courtesy, Hung Nguyen

47 Classification-Based Conversion Approaches for Lip Tracking
Two-step process
  – Classification of the acoustic signal using VQ (vector quantization), HMM (hidden Markov model) or NN (neural network)
  – Mapping of the acoustic classes into corresponding visual outputs, which are averaged to get a centroid
Shortcomings
  – Error resulting from averaging visual vectors to get the visual centroid
  – Not a continuous mapping: finite output levels
Slide: Courtesy, Hung Nguyen
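
The two-step process can be sketched in Python using vector quantization as the classifier. The slide also lists HMMs and neural networks, which are not shown here. The codebook, the training pairs and all function names are hypothetical, chosen only to make the averaging step and its shortcomings visible.

```python
def nearest_class(vector, codebook):
    """Step 1 (VQ): index of the acoustic codeword closest to `vector`."""
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


def train_visual_centroids(pairs, codebook):
    """Step 2 (training): average the visual vectors seen in each acoustic class."""
    sums, counts = {}, {}
    for acoustic, visual in pairs:
        k = nearest_class(acoustic, codebook)
        if k not in sums:
            sums[k] = [0.0] * len(visual)
            counts[k] = 0
        sums[k] = [s + v for s, v in zip(sums[k], visual)]
        counts[k] += 1
    return {k: [s / counts[k] for s in sums[k]] for k in sums}


def audio_to_visual(acoustic, codebook, centroids):
    """Conversion: classify the acoustic vector, then output that class's centroid."""
    return centroids[nearest_class(acoustic, codebook)]


codebook = [[0.0, 0.0], [10.0, 10.0]]  # hypothetical acoustic codewords
pairs = [                               # (acoustic features, mouth width/height)
    ([1.0, 0.0], [2.0, 4.0]),
    ([0.0, 1.0], [4.0, 6.0]),
    ([9.0, 10.0], [20.0, 8.0]),
]
centroids = train_visual_centroids(pairs, codebook)
print(audio_to_visual([0.5, 0.5], codebook, centroids))  # [3.0, 5.0]
```

Both shortcomings show up directly. The output [3.0, 5.0] matches neither training mouth shape in class 0, and the converter can only ever produce as many distinct outputs as there are acoustic classes.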

48 Classification-Based Conversion
[Figure labels: Phoneme Space; Viseme Space; Centroid]
Slide: Courtesy, Hung Nguyen

49 Audio and Visual Integration for Lip Reading Applications
Three major steps
  – Audio-visual pre-processing: Principal Component Analysis (PCA) has been used for feature extraction
  – Pattern recognition strategy (HMM, NN, time-warping…)
  – Integration strategy (decision making)
    Heuristic rules to incorporate knowledge of phonemes about the two modalities
    Combination of independent evaluation scores for each modality
Slide: Courtesy, Hung Nguyen
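
The last integration option, combining independent evaluation scores from each modality, can be sketched as a weighted sum in Python. The slide does not specify the combination rule, so the weighting scheme, the function name `fuse_scores` and the example scores are assumptions.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight=0.5):
    """Combine per-class scores from two recognizers by a weighted sum.

    Both inputs map the same class labels to scores. `audio_weight` in [0, 1]
    sets how much the audio recognizer is trusted; the visual recognizer gets
    the remainder. Returns (best_label, fused_scores).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    fused = {
        label: audio_weight * audio_scores[label]
        + (1.0 - audio_weight) * visual_scores[label]
        for label in audio_scores
    }
    return max(fused, key=fused.get), fused


# Hypothetical scores: noisy audio slightly prefers "da", the lips clearly say "ba".
audio = {"ba": 0.40, "da": 0.55, "ga": 0.05}
visual = {"ba": 0.80, "da": 0.10, "ga": 0.10}

print(fuse_scores(audio, visual, audio_weight=1.0)[0])  # audio only
print(fuse_scores(audio, visual, audio_weight=0.5)[0])  # equal trust
```

In practice the weight might be lowered as the acoustic signal-to-noise ratio drops, so that the visual channel carries more of the decision in noisy conditions.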

