MPEG-4 Structured Audio Eric D. Scheirer Machine Listening Group MIT Media Laboratory Editor, ISO 14496-3 (MPEG-4 Audio) Project Bar-B-Q.


2 MPEG-4 Structured Audio Eric D. Scheirer eds@media.mit.edu Machine Listening Group MIT Media Laboratory Editor, ISO 14496-3 (MPEG-4 Audio) Project Bar-B-Q 1999 Guadalupe River Ranch 15 Oct 1999

3 MPEG-4 Structured Audio, A New Standard for Interactive Sound, in the Creation of Which Tom White did not Run the Whole Show, but Only Played a Small (Though Valuable) Part Eric D. Scheirer eds@media.mit.edu Machine Listening Group MIT Media Laboratory Editor, ISO 14496-3 (MPEG-4 Audio) Project Bar-B-Q 1999 Guadalupe River Ranch 15 Oct 1999

4 What’s this all about?
- MPEG-4 is not just about compression
- MPEG-4 shows one way for the IA world to move beyond wavetable synthesis

5 Overview
- What is MPEG?
- What is MPEG-4 Structured Audio?
- Why was it created?
- How does it work?
- How can it be used in IA applications?
- What is its current status?
- A brief note on MPEG-4 AudioBIFS

6 Intellectual property in MPEG-4
- Structured Audio and AudioBIFS are free
  - All patentable IP has been released to the public domain
  - No licensing or other costs to build tools & players
  - (Standard itself costs $300 for printing/bureaucracy)
- SA and AudioBIFS are open standards
  - Companies competing through cooperation
  - Interoperability makes the whole pie bigger
  - MPEG processes for improving/correcting standard
  - MIT has no veto over the future of the standard

7 What is MPEG?
- MPEG is ISO/IEC JTC1 SC29 WG11
  - A working group of the International Organization for Standardization (ISO)
  - The “Moving Picture Experts Group”
- MPEG-1: 1993 (ISO 11172)
  - Digital audio/video coding (MP3)
- MPEG-2: 1994-7 (ISO 13818)
  - Digital coding for broadcast
- MPEG-4: 1998 (ISO 14496)
  - Object based, synthetic/natural, interactive coding

8 MPEG Marketplace Model
[Diagram labels: MPEG Committee; MPEG Standard; Server-side tools makers; Client-side tools makers; Authoring tools; Playback tools; Content developers; Content consumers; MPEG Content]

9 MPEG Marketplace Model
[Same diagram as slide 8, with the callout “This talk”]

10 MPEG Marketplace Model
[Same diagram as slide 8, with the callout “The business opportunities”]

11 MPEG-4 Audio
- High-quality sound
  - Based on MPEG-AAC algorithm: twice as good as MP3
- Low-bitrate sound
  - For WWW and cellular: speech/music as low as 4 kbps
- Synthetic sound
  - Interface to Text-to-Speech synthesizers
  - High-quality audio synthesis with Structured Audio
- AudioBIFS
  - Mix and postproduce multi-track sound streams

12 MPEG-4 Structured Audio
- Transmit structured description of sound
- Use real-time synthesis to play sound
- “PostScript for audio”
- Based on new (to MPEG) technology
  - SAOL: New music synthesis language
  - SASL: New music control format
- A lot of related technology in academia
  - Csound, Music-11, SynthScript, Nyquist, CLM,...

13 Standardization goals
- Provide synthetic sound in MPEG-4
- Bring algorithmic synthesis to wider community
  - Standardize academic state-of-the-art; don’t innovate
- Get new companies to work on synthesis
  - Implementation required for full MPEG-4 system
- Set a higher bar for PC sound architecture
  - Drive forward the world of sound on PCs!
[Slide labels: Stated goals; Secret goals]

14 MPEG-4 SA decoding process
[Diagram labels: Bitstream header; Bitstream; SAOL Decoder; SASL/MIDI Decoder; Reconfigurable Synthesis Engine; Control parameters; Samples; Multichannel high-quality audio]

15 What SAOL looks like
SAOL: Structured Audio Orchestra Language
- A C-like language
- Based on the Music-N model
- Variables hold audio signals
- Unit generators do basic functions
- Instruments controlled by score or MIDI
    instr beep(mp, vol) {
      asig wave;
      ksig env;
      table sig(harm, 2048, 1, 1);
      wave = oscil(sig, cpsmidi(mp));
      env = kline(0, dur*0.05, vol, dur*0.6, vol, dur*0.35, 0);
      output(wave * env);
    }
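
For readers who do not know SAOL, the following Python sketch approximates what the `beep` instrument appears to compute: a wavetable oscillator reading a two-harmonic table, multiplied by a piecewise-linear envelope. It is an illustration under assumptions, not the standard's normative behavior: the 8000 Hz sample rate, the unnormalized table, the non-interpolating table lookup, and passing `dur` as an explicit argument are all choices made for this sketch, and the envelope is evaluated per sample rather than at a separate control rate.

```python
import math

SAMPLE_RATE = 8000   # assumed audio rate for this sketch (not from the slide)
TABLE_SIZE = 2048    # table length used in the SAOL example


def cpsmidi(note):
    """Convert a MIDI note number to Hz (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def harm_table(size, amps):
    """One cycle of a sum of harmonics; amps[0] scales the fundamental.

    Assumption: the table is left unnormalized.
    """
    return [
        sum(a * math.sin(2 * math.pi * (h + 1) * i / size)
            for h, a in enumerate(amps))
        for i in range(size)
    ]


def kline(t, points):
    """Piecewise-linear envelope; points = [v0, d1, v1, d2, v2, ...]."""
    value, start = points[0], 0.0
    for k in range(1, len(points), 2):
        seg_dur, target = points[k], points[k + 1]
        if seg_dur > 0 and t < start + seg_dur:
            return value + (target - value) * (t - start) / seg_dur
        value, start = target, start + seg_dur
    return value  # hold the final value once all segments have elapsed


def beep(mp, vol, dur):
    """Render the beep instrument for MIDI pitch mp, level vol, dur seconds."""
    table = harm_table(TABLE_SIZE, [1, 1])
    freq = cpsmidi(mp)
    env_points = [0, dur * 0.05, vol, dur * 0.6, vol, dur * 0.35, 0]
    out, phase = [], 0.0
    for n in range(int(dur * SAMPLE_RATE)):
        t = n / SAMPLE_RATE
        wave = table[int(phase * TABLE_SIZE) % TABLE_SIZE]  # truncating lookup
        out.append(wave * kline(t, env_points))
        phase = (phase + freq / SAMPLE_RATE) % 1.0
    return out


samples = beep(69, 0.5, 0.5)  # half a second of A4 at level 0.5
```

The envelope ramps up over the first 5% of the note, holds for 60%, and decays over the last 35%, which mirrors the three segments in the SAOL `kline` call.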

16 SAOL capabilities
- Many nice features built in
  - Wavetable manipulation
  - FFT/IFFT
  - Multitap delay lines
  - Arrays of signals
  - FIR & IIR filters
  - Effects routing
  - Granular synthesis
  - 3-D audio interface
  - Dynamic layering and triggering
- SAOL is extensible-from-within
  - (Allows encapsulation and structured programming)
- Any kind of synthesis can be used in SAOL

17 Example
- “Xanadu” (Joseph Kung)
  - 60 seconds long, 44 kHz stereo (10.5 MB as WAVE)
  - 2.2 KB in header
  - 4.2 KB in bitstream (= 0.07 KB/s, about 0.56 kbps)
  - No samples anywhere, only algorithmic synthesis
  - More than 1200:1 “compression”, no loss of quality
  - Could be controlled/restructured interactively
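
The figures on this slide can be checked with a little arithmetic. The sketch below assumes 16-bit PCM, a 44.1 kHz sample rate, and 1 KB = 1000 bytes; none of these is stated on the slide. Under those assumptions the WAVE size comes out near the quoted 10.5 MB, the overall ratio is roughly 1650:1 (consistent with "more than 1200:1"), and 4.2 KB spread over 60 seconds is 0.07 KB/s, which is about 0.56 kbit/s.

```python
SECONDS = 60
SAMPLE_RATE = 44_100     # assumption: "44 KHz" means 44.1 kHz
CHANNELS = 2             # stereo
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM

# Size of the uncompressed rendering, in bytes.
wave_bytes = SECONDS * SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE

# Structured Audio payload sizes from the slide (assumption: 1 KB = 1000 bytes).
header_bytes = 2.2 * 1000
stream_bytes = 4.2 * 1000

# Ratio of PCM size to the total structured description.
ratio = wave_bytes / (header_bytes + stream_bytes)

# Average rate of the streamed part only (the header is sent once).
stream_kbytes_per_s = stream_bytes / 1000 / SECONDS
stream_kbits_per_s = stream_kbytes_per_s * 8

print(f"WAVE size: {wave_bytes / 1e6:.2f} MB")
print(f"ratio: {ratio:.0f}:1")
print(f"stream rate: {stream_kbytes_per_s:.2f} KB/s = {stream_kbits_per_s:.2f} kbit/s")
```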

18 MPEG-MMA relationship
- MIDI can control MPEG-4 SA synth
  - SASL = more flexible, more tightly coupled
- DLS-2 synthesis embedded in SA synth
  - Do wavetable in series or parallel with other techniques
- “Wavetable-only” profile of MPEG-4
  - MIDI + DLS-2 + compressed audio + video (no SAOL)
  - Logical path of progression from today to tomorrow
- Lots of help from MMA - appreciated!
  - MPEG is ready to help in the other direction (MIDI-DLA?)

19 Application ideas
- MPEG-4 is not an application!
  - It’s a tool - enables functionality and interoperability
  - Implementations could be hardware, software, both
  - Authoring tools also very important
- Use MPEG-4 SA like Staccato Synthcore
- Use MPEG-4 SA like Beatnik
- Use MPEG-4 SA like Koan
- Use MPEG-4 SA for new music applications

20 Application example: Gaming
[Diagram labels: Host program (game); MPEG-4 enabled sound card; MPEG-4 & MIDI controls; Runtime; Startup; MPEG-4 synthesis/effects algorithms; Multichannel, 3-D, post-processed sound; MPEG-4 algorithm and sample editors; MPEG-4 algorithm marketplace]
- Not just music: parametric sound effects as well
- All audio programming and asset development in SAOL
- No host-language audio programming needed
- Host APIs (e.g. DirectMusic) can generate controls
- Embedded MPEG-4 side can do this too, if useful

21 Current status
- Standard and reference software finished
- Many implementation projects starting
  - Creative Tech Center: Compression & Interactive Audio
  - Studer + EPFL: “ThreeDSpace” project
  - Hobbyist projects (Java API, ActiveX plugin)
  - Others: Be Inc., Sseyo, Kings College, UC Berkeley, Catholic U. Leuven, Q-Team DE, Nokia,...
  - 3 complete implementations already!
- A few authoring tools projects
- Active mailing list for developers

22 A brief note on AudioBIFS
- BIFS is scene-description part of MPEG-4
  - “Binary Format for Scenes”
  - Based on VRML, but with many new features
- AudioBIFS is the audio mixing part
  - Stream audio in multitrack format
  - Deliver mixdown instructions in AudioBIFS
  - Mixing, spatialization, effects in SAOL, multichannel
  - Terminal-adaptive capability
  - Candidate for “PC DSP architecture”?

23 AudioBIFS - scene graph model
[Diagram nodes: Audio Source; Natural Decoder; Synthetic Decoder; AudioBIFS manipulation; Sound]
- Streaming compressed audio & synthesis controls
- Decode into raw audio samples
- Inject sound into scene graph
- Create sound object with AudioBIFS (mixing, filtering, reverb, etc.)
- Attach sound to main scene (spatially position if desired)
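
The flow on this slide (decode, inject into the graph, manipulate, attach to the scene) can be pictured as a small tree of audio nodes. The Python sketch below is only a toy model of that idea: the class names `SourceNode`, `GainNode`, `MixNode`, and `SoundNode` are hypothetical and are not the node definitions from the MPEG-4 standard, and the "decoded" streams are just short lists of numbers.

```python
class SourceNode:
    """Leaf: samples already decoded from a natural or synthetic stream."""

    def __init__(self, samples):
        self.samples = list(samples)

    def render(self):
        return list(self.samples)


class GainNode:
    """Scales the output of one child node (a stand-in for any effect)."""

    def __init__(self, child, gain):
        self.child = child
        self.gain = gain

    def render(self):
        return [self.gain * s for s in self.child.render()]


class MixNode:
    """Sums the outputs of several children, treating short ones as silent."""

    def __init__(self, *children):
        self.children = children

    def render(self):
        rendered = [child.render() for child in self.children]
        length = max((len(r) for r in rendered), default=0)
        return [sum(r[i] for r in rendered if i < len(r))
                for i in range(length)]


class SoundNode:
    """Attaches an audio subgraph to the scene, optionally at a 3-D position."""

    def __init__(self, source, position=None):
        self.source = source
        self.position = position  # stored only; no spatialization in this toy

    def render(self):
        return self.source.render()


# One decoded natural stream and one decoded synthetic stream (made-up data).
natural = SourceNode([1.0, 0.5, 0.0])
synthetic = SourceNode([0.5, 0.5])

# Mix them, attenuating the natural stream, then attach the result to the scene.
sound = SoundNode(MixNode(GainNode(natural, 0.5), synthetic), position=(1, 0, 0))
mixed = sound.render()
```

The point of the tree shape is that the mixdown instructions travel with the content, so the same streams can be recombined differently without re-encoding them.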

24 Summary
- MPEG-4 Structured Audio
  - The international standard for algorithmic sound synthesis
- MPEG-4 AudioBIFS
  - The international standard for audio postproduction
- New market opportunities for
  - Hardware/software MPEG-4 players (embedded or not)
  - Authoring tools (editors, sequencers)
  - Advanced interactive audio content

25 What was this all about?
- MPEG-4 is not just about compression
- MPEG-4 shows one way for the IA world to move beyond wavetable synthesis

26 For more information
- MPEG home page: http://www.cselt.it/mpeg
  - Requirements, future of MPEG
- MPEG-4 SA home page: http://sound.media.mit.edu/mpeg4
  - Draft standard, code, mailing lists, matchmaking
- Contact: eds@media.mit.edu
  - Slides, technical papers, discussion available

