MMDB-8 J. Teuhola 2012. 8. Audio databases. About digital audio: Advent of the digital audio CD in 1983. Order of magnitude improvement in overall sound quality.


Audio databases

About digital audio:
- Advent of the digital audio CD in 1983.
- Order-of-magnitude improvement in overall sound quality and signal-to-noise ratio over the best analog systems.
- Wide bandwidth is required for on-line transmission.

Converting an analog signal into digital form: linear Pulse Code Modulation (PCM), a two-stage process:
(a) Sampling: observing the signal amplitude at regular time intervals; typical sampling frequencies are in the kHz range.
(b) Quantization: a discrete scale for the observed amplitudes, typically 16 bits per sample, giving 2^16 = 65,536 possible values.

Audio CD: 16-bit samples at a 44.1 kHz rate, with two (stereo) channels: 2 x 16 x 44,100 ≈ 1.4 Mbits per second.
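The two PCM stages and the Audio-CD bitrate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 440 Hz test tone, the function names and the amplitude range [-1.0, 1.0] are not from the slides.

```python
import math

def pcm_encode(signal, duration_s, sample_rate_hz, bits=16):
    """Sample a continuous signal (a function of time in seconds that returns
    an amplitude in [-1.0, 1.0]) and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1               # 32767 for 16 bits
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # (a) sampling instant
        amplitude = max(-1.0, min(1.0, signal(t)))
        samples.append(round(amplitude * max_level))  # (b) quantization
    return samples

def pcm_bitrate(channels, bits, sample_rate_hz):
    """Uncompressed bitrate in bits per second."""
    return channels * bits * sample_rate_hz

def tone(t):
    # Assumed test signal: a 440 Hz sine wave.
    return math.sin(2 * math.pi * 440.0 * t)

samples = pcm_encode(tone, duration_s=0.01, sample_rate_hz=44_100)
cd_rate = pcm_bitrate(channels=2, bits=16, sample_rate_hz=44_100)
print(len(samples), "samples;", 2 ** 16, "levels;", cd_rate, "bits per second")
```

The last line reproduces the arithmetic of the slide: 2 x 16 x 44,100 = 1,411,200 bits per second, i.e. about 1.4 Mbits per second.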

[Figure: Illustration of audio concepts. Labels: amplitude, time, wavelength, sampling interval.]

Audio compression techniques

(a) Delta modulation:
- Extremely simple; sometimes used for speech coding.
- 1-bit quantizer for amplitude differences: 0 = -Δ, 1 = +Δ.

(b) Adaptive Differential Pulse Code Modulation (ADPCM):
- The next sample value is predicted on the basis of recent history; the prediction error is quantized and coded.
- Used mainly for speech coding, e.g. ITU-T G.726.

(c) Subband coding:
- The signal is divided into frequency components (bands).
- The bands are encoded separately.
- E.g. ITU-T recommendation G.722: high-quality speech at 64 Kbits per second.
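Delta modulation, technique (a), is simple enough to sketch in full. The following is a minimal illustration with a fixed step size; the sample values, the step of 3 and the function names are assumptions chosen for the example.

```python
def delta_encode(samples, step, start=0):
    """Emit one bit per sample: 1 moves the running approximation up by
    `step`, 0 moves it down by `step`."""
    bits = []
    approx = start
    for s in samples:
        if s > approx:
            bits.append(1)
            approx += step
        else:
            bits.append(0)
            approx -= step
    return bits

def delta_decode(bits, step, start=0):
    """Rebuild the staircase approximation from the bit stream."""
    approx = start
    out = []
    for b in bits:
        approx += step if b else -step
        out.append(approx)
    return out

samples = [0, 3, 7, 10, 12, 11, 8, 4]
bits = delta_encode(samples, step=3)
decoded = delta_decode(bits, step=3)
print(bits)     # [0, 1, 1, 1, 1, 1, 0, 0]
print(decoded)  # [-3, 0, 3, 6, 9, 12, 9, 6]
```

The decoded staircase only tracks the input approximately. ADPCM, technique (b), improves on this by predicting the next sample from recent history and by quantizing the prediction error with more than one bit.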

MPEG audio

- Sampling rates of 32, 44.1 or 48 kHz (or half of these); samples are processed in frames of 384 or 1152 samples.
- Subband coding with a bank of 32 filters, each with a bandwidth of 1/64 of the sampling frequency.
- Samples are coded with variable quantization steps.
- Psychoacoustics: the coder exploits the masking properties of the human ear.
- Compressed bitrates range from 32 to 224 Kbits per second; compression factors from 2.7 to 24.
- MPEG Layer I: best for bitrates > 128 Kbits per second (per channel).
- MPEG Layer II: best for bitrates around 128 Kbits per second (per channel).
- MPEG Layer III: best for bitrates around 64 Kbits per second (per channel); this is the MP3 format used for music on the Internet (compression around 12:1).
- Discrete Cosine Transform (DCT) on subband signals.
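Some of the figures above follow from simple arithmetic, sketched below. The sketch assumes that the uncompressed source is 16-bit PCM and measures rates per channel; the slides do not state which source format the compression factors refer to, and the function names are illustrative only.

```python
def pcm_rate_kbits(sample_rate_hz, bits=16):
    """Uncompressed bitrate of one PCM channel in Kbits per second
    (assumption: 16-bit samples)."""
    return sample_rate_hz * bits / 1000

def compression_factor(sample_rate_hz, compressed_kbits, bits=16):
    """Ratio of the uncompressed to the compressed per-channel bitrate."""
    return pcm_rate_kbits(sample_rate_hz, bits) / compressed_kbits

def subband_width_hz(sample_rate_hz, n_bands=32):
    # The usable spectrum extends to half the sampling frequency; splitting it
    # into 32 equal bands gives 1/64 of the sampling frequency per band.
    return sample_rate_hz / (2 * n_bands)

def frame_duration_ms(samples_per_frame, sample_rate_hz):
    """Playing time covered by one frame, in milliseconds."""
    return 1000 * samples_per_frame / sample_rate_hz

print(compression_factor(48_000, 64))   # 12.0
print(compression_factor(48_000, 32))   # 24.0
print(subband_width_hz(48_000))         # 750.0
print(frame_duration_ms(1152, 48_000))  # 24.0
```

Under these assumptions, 64 Kbits per second per channel corresponds to the roughly 12:1 compression quoted for Layer III, and the lowest bitrate of 32 Kbits per second to the upper compression factor of 24.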

Audio data retrieval

(a) Based on metadata
- Additional attributes can be attached to voice data (as with images and video), e.g. speaker, date, duration, composer, orchestra, instrument, ...
- Attributes can be connected to the whole audio sequence or to some parts of it (e.g. parts of a symphony).
- General document retrieval techniques usually apply.
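Metadata-based retrieval can be illustrated with a small inverted index over attribute-value pairs. The records, file names and attribute values below are made up for the example, and the layout (a time span per record, with None meaning the whole recording) is one possible design, not something prescribed by the slides.

```python
from collections import defaultdict

# Hypothetical records: each describes a whole recording (span_s is None)
# or a part of it (span_s holds start and end times in seconds).
records = [
    {"audio": "symphony.wav", "span_s": None,
     "attributes": {"composer": "Composer A", "orchestra": "Orchestra X"}},
    {"audio": "symphony.wav", "span_s": (0, 540),
     "attributes": {"composer": "Composer A", "part": "1st movement"}},
    {"audio": "interview.wav", "span_s": None,
     "attributes": {"speaker": "Speaker B", "date": "2012-01-15"}},
]

def build_inverted_index(records):
    """Map each (attribute, value) pair to the positions of matching records."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for attribute, value in record["attributes"].items():
            index[(attribute, value)].add(position)
    return index

def search(index, records, **criteria):
    """Return the records that match all attribute=value criteria."""
    postings = [index.get((a, v), set()) for a, v in criteria.items()]
    if not postings:
        return []
    hits = set.intersection(*postings)
    return [records[i] for i in sorted(hits)]

index = build_inverted_index(records)
for hit in search(index, records, composer="Composer A", part="1st movement"):
    print(hit["audio"], hit["span_s"])
```

The query returns only the record describing the first part of the recording, which shows how attributes attached to a part can be retrieved separately from those attached to the whole.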

Audio data retrieval (cont.)

(b) Speech recognition:
- Proximity search of the waveform; feature extraction e.g. from the coefficients of the DCT-transformed signal.
- Some fuzziness is involved.
- Simple application: giving voice commands to a user interface.
- Advanced application: parsing of spoken sentences and conversion e.g. to database queries.
  - Can be coupled with natural language understanding techniques.
  - Usually based on a predefined set of patterns and associated phonetic rules.

Audio data retrieval (cont.)

(c) Speaker recognition:
- Application: security systems.
- Sensitive to the physical condition (e.g. flu) of the speaker.
- Variations:
  - Text-dependent recognition (simpler): restricted set of possible words/sentences; comparison of digital waveforms.
  - Text-independent recognition (more difficult): based e.g. on voice pitch recognition. More elaborate sentences from particular users must be stored, and complex verification algorithms are run against the spoken samples.

Audio data retrieval (cont.)

(d) Recognition and retrieval of songs (recorded music)

Query input alternatives:
- Query-by-humming: succeeds for clearly distinguishable melodies (or themes), in spite of small pitch errors. The similarity measure uses some kind of edit distance.
- Tapping the tempo: complements humming/singing.
- Playing a (virtual) keyboard.

Output: ranked list of candidate songs.

Example search engine: Musipedia
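A rough sketch of query-by-humming with an edit-distance similarity measure follows. It assumes that the hummed query and the stored melodies are already available as pitch sequences, and it reduces each melody to its up/down/repeat contour so that small pitch errors matter less. The catalogue, the song titles and the choice of contour strings are assumptions for the example, not a description of any particular search engine.

```python
def contour(pitches):
    """Reduce a pitch sequence to its contour: U = up, D = down, R = repeat."""
    return "".join(
        "U" if b > a else "D" if b < a else "R"
        for a, b in zip(pitches, pitches[1:])
    )

def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions and
    substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def rank_songs(hummed_pitches, catalogue):
    """Return (distance, title) pairs, best match first."""
    query = contour(hummed_pitches)
    return sorted(
        (edit_distance(query, contour(pitches)), title)
        for title, pitches in catalogue.items()
    )

# Hypothetical catalogue: title -> melody as note numbers.
catalogue = {
    "song_a": [60, 60, 67, 67, 69, 69, 67],
    "song_b": [64, 62, 60, 62, 64, 64, 64],
}
hummed = [61, 61, 68, 67, 70, 70, 68]   # off-key, with one wrong step
ranking = rank_songs(hummed, catalogue)
print(ranking[0])   # (1, 'song_a')
```

The output is a ranked list of candidates, as on the slide; the off-key query still ranks "song_a" first because only one contour symbol differs.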

Encoding and retrieval of (synthetic) music

Music encoding:
- For digital electronic instruments (no singing!)
- Timing of note-on/note-off events
- Control of instrument and playback parameters (pitch, loudness)
- Can be played with a synthesizer

Encoding formats:
- MIDI (Musical Instrument Digital Interface)
- MPEG-4 SA (Structured Audio)
- MusicXML (notes represented using structured markup)

Retrieval criteria:
- Notes: generalization of string matching (but: polyphony!)
- Time-dependent parameters: instruments, tempo, volume, ...
- Textual metadata: title, composer, artist, genre, date, ...
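The event-based encoding and note matching can be sketched as follows. The event tuples are a simplified, hypothetical representation (not the actual MIDI byte format), pitches are given as note numbers, and the matching handles only a single melodic line; polyphony, which the slide flags as the hard case, is not covered.

```python
# Hypothetical events: (time in seconds, kind, pitch as note number, loudness).
events = [
    (0.0, "on", 60, 90), (0.5, "off", 60, 0),
    (0.5, "on", 62, 90), (1.0, "off", 62, 0),
    (1.0, "on", 64, 90), (2.0, "off", 64, 0),
]

def events_to_notes(events):
    """Pair each note-on with the next note-off of the same pitch.
    Returns (pitch, start, duration) tuples ordered by start time."""
    active = {}
    notes = []
    for time, kind, pitch, _loudness in sorted(events, key=lambda e: e[0]):
        if kind == "on":
            active[pitch] = time
        elif kind == "off" and pitch in active:
            start = active.pop(pitch)
            notes.append((pitch, start, time - start))
    return sorted(notes, key=lambda n: n[1])

def intervals(pitches):
    """Pitch differences between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def contains_melody(notes, query_pitches):
    """String matching on intervals: true if the query occurs in the note
    sequence, regardless of the key it is played in."""
    hay = intervals([pitch for pitch, _, _ in notes])
    needle = intervals(query_pitches)
    return any(
        hay[i:i + len(needle)] == needle
        for i in range(len(hay) - len(needle) + 1)
    )

notes = events_to_notes(events)
print(notes)                                  # [(60, 0.0, 0.5), (62, 0.5, 0.5), (64, 1.0, 1.0)]
print(contains_melody(notes, [67, 69, 71]))   # True: same melody, transposed
```

Matching on intervals rather than on absolute pitches is one way to generalize plain string matching, since a transposed query still matches.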

Indexing of audio data

Indexing of metadata (external attributes):
- As with any other documents: inverted indexes, multi-attribute indexes, signature files, etc.

Indexing of the audio signal:
- The signal is first split into segments (= frames, windows). Segmentation requires some rules, e.g. 'quiet' zones are possibly good split points.
- Each segment is transformed (e.g. by DCT) into features.
- A multidimensional index is built from groups of the features (e.g. the main DCT coefficients).
- Proximity queries (nearest neighbor, or k nearest neighbors of the query sample) should be supported by the index.
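The signal-indexing pipeline above can be sketched end to end. The sketch makes several simplifying assumptions that are not from the slides: fixed-size frames instead of splitting at quiet zones, the first eight coefficients of an unnormalized DCT-II as features, two synthetic test signals, and a linear scan standing in for a real multidimensional index.

```python
import math

def split_frames(samples, size):
    """Split the signal into consecutive fixed-size frames (assumption:
    no overlap, incomplete last frame dropped)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def dct2(frame):
    """Unnormalized DCT-II of one frame."""
    n = len(frame)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(frame))
        for k in range(n)
    ]

def features(frame, n_coeffs=8):
    """Feature vector: the first n_coeffs DCT coefficients."""
    return dct2(frame)[:n_coeffs]

def build_index(signals, frame_size=32):
    """List of (feature vector, (label, frame number)) entries."""
    return [
        (features(frame), (label, number))
        for label, samples in signals.items()
        for number, frame in enumerate(split_frames(samples, frame_size))
    ]

def k_nearest(index, query_features, k=1):
    """Linear-scan k-nearest-neighbor query by Euclidean distance."""
    return sorted(index, key=lambda entry: math.dist(entry[0], query_features))[:k]

# Two synthetic signals: one and three cycles per 32-sample frame.
signals = {
    "low": [math.cos(2 * math.pi * 1 * i / 32) for i in range(64)],
    "high": [math.cos(2 * math.pi * 3 * i / 32) for i in range(64)],
}
index = build_index(signals)

# Query sample: a quieter copy of one frame of the "high" signal.
query = [0.9 * x for x in signals["high"][:32]]
nearest = k_nearest(index, features(query), k=2)
print([label for _, (label, _) in nearest])   # ['high', 'high']
```

With many frames, the linear scan would be replaced by a multidimensional index structure that answers the same nearest-neighbor query without visiting every entry.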