Does Spatialized Audio Change Everything?


Does Spatialized Audio Change Everything?

I've been at Unity for almost two years now, focused mostly on VR and AR audio and video. By video, I mean video decoders. So what does that mean? For audio, it basically means adding more realistic capabilities to Unity audio. We have three audio engineers in Copenhagen focused on core audio, and I'm here in Bellevue focused on things like HRTF and the frameworks to support HRTF (spatializers), ambisonics, and environmental audio. That sounds exciting, but so far my role has really been to get audio working on some new VR and AR platforms and to create frameworks to support this work, or at least to make sure the Unity framework does not prohibit it.

On the video side, I'm making sure our Unity video player works efficiently on new XR platforms: that it works with spatialized audio, and that it works in AR with the reconstructed meshes captured from the room you're in.

Today, I want to focus on the XR audio work I've been thinking about, mostly the framework, or audio engine, around HRTF nodes (spatializers). "Does Spatialized Audio Change Everything?" I want to explore that with you all a little bit today.

What is a traditional audio engine?
- Tree of nodes
- Each node performs an operation
- Buffers of data flow through these nodes
- Metadata describes the buffers
- Audio source nodes work on mono data
- Audio mixer nodes work on stereo data (in XR)
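To make the tree-of-nodes idea concrete, here is a minimal C++ sketch; the names AudioBuffer, AudioNode, and MixerNode are illustrative, not Unity's actual API. Each node pulls buffers from its inputs, performs its one operation, and passes the result up the tree.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// A buffer of PCM samples plus the metadata that describes it.
struct AudioBuffer {
    std::vector<float> samples;
    int channels = 1;       // 1 = mono source, 2 = stereo mix
    int sampleRate = 48000;
};

// Base class for every node in the tree. Each node pulls from its
// inputs, performs one operation, and writes the result to `out`.
class AudioNode {
public:
    virtual ~AudioNode() = default;
    void addInput(std::shared_ptr<AudioNode> node) { inputs.push_back(std::move(node)); }
    virtual void process(AudioBuffer& out, size_t frames) = 0;
protected:
    std::vector<std::shared_ptr<AudioNode>> inputs;
};

// A mixer node: sums all of its inputs into one buffer.
// Assumes every input honors the requested frame count and channel layout.
class MixerNode : public AudioNode {
public:
    void process(AudioBuffer& out, size_t frames) override {
        out.samples.assign(frames * out.channels, 0.0f);
        AudioBuffer scratch = out;
        for (auto& input : inputs) {
            input->process(scratch, frames);
            for (size_t i = 0; i < out.samples.size(); ++i)
                out.samples[i] += scratch.samples[i];
        }
    }
};
```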

What are some audio nodes?
- File nodes stream in data
- De-compression nodes produce raw PCM data
- Low-pass filter nodes reduce high frequencies
- High-pass filter nodes reduce low frequencies
- Mixer nodes add all input buffers
- 3D panner nodes output stereo based on 3D position
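As a taste of how cheap most of these nodes are, here is a hypothetical one-pole low-pass filter body in the same sketch style; it costs one multiply-add per sample, which is why filter nodes barely register next to de-compression and spatialization.

```cpp
#include <cmath>
#include <vector>

// One-pole low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1]).
// One multiply-add per sample, so it is nearly free per voice.
class LowPassFilter {
public:
    LowPassFilter(float cutoffHz, float sampleRate) {
        // Standard one-pole coefficient derivation.
        a = 1.0f - std::exp(-2.0f * 3.14159265f * cutoffHz / sampleRate);
    }
    void process(std::vector<float>& samples) {
        for (float& s : samples) {
            state += a * (s - state);
            s = state;
        }
    }
private:
    float a = 1.0f;
    float state = 0.0f;
};
```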

What is a spatializer node?
- A realistic 3D panner node
- The left channel is sent the data for the left ear
- The right channel is sent the data for the right ear
- More subtle than a traditional 3D panner node
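Conceptually, a spatializer convolves the mono input with a separate head-related impulse response (HRIR) for each ear, chosen by source direction. Below is a naive direct-convolution sketch; the Hrir struct and spatialize function are illustrative, and production spatializers use partitioned FFT convolution and interpolate between measured HRIRs.

```cpp
#include <cstddef>
#include <vector>

// One pair of head-related impulse responses for a given source direction.
struct Hrir {
    std::vector<float> left;   // impulse response for the left ear
    std::vector<float> right;  // impulse response for the right ear
};

// Naive causal convolution of a mono block with the per-ear HRIRs.
void spatialize(const std::vector<float>& mono, const Hrir& hrir,
                std::vector<float>& outLeft, std::vector<float>& outRight) {
    outLeft.assign(mono.size(), 0.0f);
    outRight.assign(mono.size(), 0.0f);
    for (size_t n = 0; n < mono.size(); ++n) {
        for (size_t k = 0; k < hrir.left.size() && k <= n; ++k) {
            outLeft[n]  += mono[n - k] * hrir.left[k];
            outRight[n] += mono[n - k] * hrir.right[k];
        }
    }
}
```

The per-sample inner loop over the impulse response length is exactly what makes a spatializer expensive compared to a simple panner, even after moving to FFT-based convolution.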

What is unique about a spatializer node?
- Fundamental for 3D sounds (for XR and headphones)
- Expensive

Other fundamental, expensive nodes?
- De-compression nodes
- Project-specific nodes
- Nothing else is so fundamental, necessary, and expensive

What's the problem?
- Spatializer nodes fit well into the traditional audio engine
- The problem is performance

What's the node performance?
- Performance numbers taken on a Samsung S8 Android phone
- Most nodes are relatively cheap
- De-compression nodes cost 0.75% CPU per sound (Vorbis)
- Spatializer nodes cost 1.0% CPU per sound (Oculus)

What is game audio's budget?
- 1 ms on the main thread (30 fps game)
- 0.5 ms on the main thread (60 fps XR game)
- 5-10% of the device's memory and CPU resources
- 50% of a core on the audio thread

What's the cost of de-compressed, spatialized sounds?
- Each sound costs 0.75% + 1.0% = 1.75% CPU
- 16 sounds cost 1.75% x 16 = 28%
- 28 sounds cost 1.75% x 28 = 49%
- 32 sounds cost 1.75% x 32 = 56%
- 64 sounds cost 1.75% x 64 = 112%

What can we do with mobile XR?
- We can support 28 sounds (49%, just under the 50% audio-thread budget), if we do nothing else
- But we also want:
  - Occlusion and low-pass filtering
  - Play requests
  - Reverb
  - Event systems
  - 64 sounds for AAA audio

Optimization: Prioritize
- Prioritize sounds
- Separate virtual and physical sounds
- Spatialize only the most important physical sounds
- 32 physical sounds with 16 spatialized cost 32 x 0.75% + 16 x 1.0% = 40%
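A sketch of what the prioritization step might look like: score every playing sound, promote the top N to physical voices, and spatialize only the top M of those; everything else stays virtual (position tracked, no DSP). The Voice struct, cullVoices function, and scoring formula are illustrative placeholders.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Voice {
    float priority;           // designer-assigned importance
    float distance;           // meters from the listener
    bool physical = false;    // gets real DSP this frame
    bool spatialized = false; // gets the expensive HRTF node
};

// Score: closer and higher-priority sounds win. The formula is an
// illustrative placeholder, not a recommendation.
static float score(const Voice& v) { return v.priority / (1.0f + v.distance); }

void cullVoices(std::vector<Voice>& voices, size_t maxPhysical, size_t maxSpatialized) {
    std::sort(voices.begin(), voices.end(),
              [](const Voice& a, const Voice& b) { return score(a) > score(b); });
    for (size_t i = 0; i < voices.size(); ++i) {
        voices[i].physical = i < maxPhysical;        // the rest stay virtual
        voices[i].spatialized = i < maxSpatialized;  // only the top slice gets HRTF
    }
}
// cullVoices(voices, 32, 16) reproduces the 32-physical / 16-spatialized case above.
```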

Optimization: Cheap spatialization
- Use realistic distance attenuation curves
- Apply ITD (interaural time difference)
- Apply ILD (interaural level difference)
- Apply simple low-pass filtering based on location
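A minimal low-LOD spatializer sketch, assuming a simple spherical-head model: attenuate by distance, derive an interaural level difference from the azimuth, and delay the far ear by up to roughly 0.7 ms for the interaural time difference. All names and constants here are illustrative.

```cpp
#include <algorithm>
#include <cmath>

// Low-LOD spatialization parameters for one source, derived from its
// position relative to the listener.
struct CheapPan {
    float gainLeft, gainRight;  // ILD: level difference between the ears
    int delayLeft, delayRight;  // ITD: arrival-time difference, in samples
};

CheapPan computeCheapPan(float azimuthRad, float distanceMeters, float sampleRate) {
    CheapPan p;
    // Distance attenuation: inverse-distance rolloff, clamped near the head.
    float atten = 1.0f / std::max(distanceMeters, 1.0f);
    // ILD: constant-power pan driven by azimuth (0 = front, +pi/2 = hard right).
    float pan = 0.5f * (1.0f + std::sin(azimuthRad));  // 0..1
    p.gainLeft  = atten * std::cos(pan * 1.5707963f);
    p.gainRight = atten * std::sin(pan * 1.5707963f);
    // ITD: at most ~0.7 ms between the ears for a typical head width.
    float itdSec = 0.0007f * std::sin(azimuthRad);
    int itdSamples = static_cast<int>(std::fabs(itdSec) * sampleRate);
    p.delayLeft  = itdSec > 0.0f ? itdSamples : 0;  // source on the right: left ear lags
    p.delayRight = itdSec < 0.0f ? itdSamples : 0;
    return p;
}
```

The location-based low-pass filter from the last bullet can be bolted on the same way, keying the cutoff to distance or to how far the source sits behind the head.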

Optimization: Group nearby sounds
- Group sounds at the same location together
- Spatialize the mixed group of sounds
- Group far-away sounds more aggressively
- Nearby, character sounds play from the mouth and feet
- Far away, character sounds play from the character's center
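One way the grouping could work, sketched as greedy clustering: each sound joins the first group within a radius that grows with distance from the listener, so far-away sounds collapse into fewer groups. This sketch only picks group positions; the engine would then mix each group to mono and spatialize it once.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

static float dist(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Greedy clustering: a sound joins the first existing group within the
// grouping radius, otherwise it founds a new group at its own position.
// The radius grows with listener distance, so distant sounds merge more.
std::vector<Vec3> groupSources(const std::vector<Vec3>& sources, const Vec3& listener) {
    std::vector<Vec3> groups;
    for (const Vec3& s : sources) {
        float radius = 0.5f + 0.1f * dist(s, listener);  // illustrative constants
        bool joined = false;
        for (const Vec3& g : groups)
            if (dist(s, g) < radius) { joined = true; break; }
        if (!joined) groups.push_back(s);
    }
    return groups;  // one spatializer instance per returned position
}
```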

Optimization: De-compress less
- Don't repeatedly de-compress small, frequently played sounds
- De-compress them once
- Use more memory and less CPU
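A sketch of the decode-once idea: cache raw PCM keyed by clip ID so small, frequently triggered sounds are de-compressed a single time, trading memory for CPU. The PcmCache name is made up, and decodeCompressedClip stands in for the real Vorbis decoder.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Placeholder for the real decoder (e.g. Vorbis -> raw PCM).
std::vector<float> decodeCompressedClip(const std::string& clipId);

// Decode-once cache: each clip is decompressed the first time it is
// requested and replayed from raw PCM afterwards.
class PcmCache {
public:
    const std::vector<float>& get(const std::string& clipId) {
        auto it = cache.find(clipId);
        if (it == cache.end())
            it = cache.emplace(clipId, decodeCompressedClip(clipId)).first;
        return it->second;
    }
private:
    std::unordered_map<std::string, std::vector<float>> cache;
};
```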

Optimizations: All of the above
- Play 32 physical sounds
- De-compress 24 sounds
- Spatialize 8 locations (16 sounds)
- Use low-LOD spatialization on the other sounds
- 24 x 0.75% + 8 x 1.0% = 26% CPU usage

Is there a different approach?
- Google Resonance
- Windows Sonic
- Dolby Atmos

Google Resonance
- Convert each sound into ambisonic format
- Mix the ambisonic sounds
- Decode the one mixed ambisonic sound and spatialize it in one step
- Scales very well
- Each sound costs 0.75% to de-compress (same as before)
- Each sound costs 0.6% to spatialize (instead of 1.0%)
- 32 de-compressed, spatialized sounds cost 1.35% x 32 ≈ 43%
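The scaling win comes from the pipeline shape: per-source work shrinks to a cheap ambisonic encode, and the expensive HRTF decode runs once on the combined mix instead of once per source. Here is a first-order (4-channel B-format) encode sketch, assuming ACN channel order and SN3D normalization; Resonance actually supports higher orders, and encodeIntoMix is an illustrative name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// First-order ambisonic encode (B-format, ACN order W,Y,Z,X, SN3D) of a
// mono source at a given direction, accumulated into the shared mix.
// Per-source cost is four multiply-adds per sample; the expensive HRTF
// decode then runs once on the 4-channel mix, not once per source.
void encodeIntoMix(const std::vector<float>& mono,
                   float azimuthRad, float elevationRad,
                   std::vector<float>* mix /* 4 planar channels */) {
    const float w = 1.0f;
    const float y = std::sin(azimuthRad) * std::cos(elevationRad);
    const float z = std::sin(elevationRad);
    const float x = std::cos(azimuthRad) * std::cos(elevationRad);
    for (size_t n = 0; n < mono.size(); ++n) {
        mix[0][n] += w * mono[n];
        mix[1][n] += y * mono[n];
        mix[2][n] += z * mono[n];
        mix[3][n] += x * mono[n];
    }
}
```

The binaural decode that follows has a fixed cost regardless of how many sources were encoded, which is where the per-sound savings come from.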

Google Resonance limitations
- The mixer part of the audio engine does not exist
- Or the mixers are in 16-channel (third-order) ambisonic format

Negatives to eliminating Audio Mixers?
- Sound designer workflow ("Lower the player's foley.")
- Losing the ability to apply expensive effects once to many sounds
- HDR and side-chaining (activity on a mixer affects other sound/mixer properties)

Positives to eliminating Audio Mixers?
- Very simple pipeline
- Easy, flexible jobification
- Low latency because of few dependencies

My personal thoughts (ambisonic)
- Initially, I was very intrigued by the ambisonic / Google approach
- But the performance improvement is limited for 32 sounds on mobile
- Not good enough to throw away the traditional audio engine design

My personal thoughts (traditional)
- We need an excellent prioritization / culling algorithm
- We need a lower-quality spatializer to pair with the high-quality spatializer
- These optimizations feel like traditional game engine / audio optimizations
- We can do this!

What do you think?