Audio design for interactive systems Camille Goudeseune Integrated Systems Laboratory, Beckman Institute
Outline
  Sound libraries
    –music instruments
  Digital Audio
    –music theory
  Roles of sound
    –composition
High-level multiplatform sound libraries (Linux + Windows at least)
  Java Sound API
  VSS
  FMOD
  SDL
  SEAL
  Housemarque
Java Sound API
  Painless for Java.
  Very basic. Avoid if you're ambitious.
  http://java.sun.com/products/java-media/sound/
VSS (Virtual Sound Server)
  www.isl.uiuc.edu/software/software.html
  The 500-pound gorilla: use it if you need stuff like 8-channel output or Linux/IRIX/Windows compatibility.
  Steep learning curve for the advanced stuff.
FMOD
  Syzygy (the Cube) uses it for sound.
  CPU-miserly.
  Finite but featureful API.
  www.fmod.org
SDL
  Large user community.
  Open source.
  Good if you want to integrate sound and graphics tightly.
  www.libsdl.org
SEAL (Synthetic Audio Library)
  Many C compilers supported.
  Wide support for hardware acceleration.
  www.sonicspot.com/sealsdk/sealsdk.html
Housemarque
  Multichannel with certain PC soundcards.
  www.s2.org/hmqaudio/
Low-level sound libraries: Linux
  OSS, www.linux.org.uk/OSS/
    –Most distros include the free basic version.
    –~$30 for fancy multichannel soundcard drivers.
  ALSA, www.alsa-project.org
    –Religiously open-source alternative.
    –0.5.10 is the stable version.
    –The 0.9.0 development version is fine too (part of the upcoming 2.5 Linux kernel).
Low-level sound libraries: Windows
  MMIO
    –simple, universal
    –waveOutGetDevCaps(), waveOutWrite(), ...
  DirectSound
    –part of DirectX
    –LPDIRECTSOUNDBUFFER, CreateSoundBuffer(), Lock(), Unlock(), ...
    –“faster” but more awkward; use a wrapper
Basics of Digital Audio: File formats
  16-bit or 8-bit?
    –These days, 8-bit is embarrassing. Pro gear uses 32!
  Stereo or mono?
    –Panning mono is faster + simpler than stereo.
  8 kHz? 44 kHz?
  .WAV, .AIFF, .AU, .MP3
  Converters: “sox” and dozens of others
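The format choices above (bit depth, channel count, sample rate) map directly onto the header fields of a .WAV file. As a minimal sketch, assuming Python and its standard `wave` module, the following writes a 16-bit mono sine tone into memory. The function name `sine_wav_bytes` and its default parameters are illustrative, not part of the talk.

```python
import io
import math
import struct
import wave

def sine_wav_bytes(freq_hz=440.0, seconds=1.0, rate=44100, amplitude=0.5):
    """Return a 16-bit mono WAV file (as bytes) holding a sine tone."""
    n_frames = int(seconds * rate)
    # Scale [-1, 1] floats to the signed 16-bit range.
    samples = (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        for i in range(n_frames)
    )
    pcm = b"".join(struct.pack("<h", s) for s in samples)  # little-endian int16
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)     # mono
        w.setsampwidth(2)     # 2 bytes = 16 bits per sample
        w.setframerate(rate)  # frames per second
        w.writeframes(pcm)
    return buf.getvalue()
```

Writing the returned bytes to a file such as `tone.wav` (a hypothetical name) gives a file that a converter like sox should be able to read.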
Basics of Digital Audio: Debugging
  Gaps in sound: excessive CPU use
  Stuttering: CPU starvation (CPU fast enough but poorly scheduled)
  Different from graphics! Audio's “frames per second” can’t degrade when the CPU is taxed: a late buffer is heard as a gap.
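The arithmetic behind gaps can be sketched as follows, assuming a simplified model in which each output buffer must be refilled before it finishes playing. The function names and the list of measured refill times are hypothetical; real drivers usually queue several buffers, which this sketch ignores.

```python
def buffer_duration_ms(frames, rate_hz):
    """How long one buffer of `frames` sample frames lasts, in milliseconds."""
    return 1000.0 * frames / rate_hz

def count_underruns(fill_times_ms, frames, rate_hz):
    """Count refills that took longer than one buffer lasts (each is an audible gap)."""
    deadline = buffer_duration_ms(frames, rate_hz)
    return sum(1 for t in fill_times_ms if t > deadline)
```

For example, a 441-frame buffer at 44100 Hz lasts 10 ms, so any refill slower than that misses its deadline no matter how fast the CPU is on average.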
Basics of Digital Audio: Debugging
  Electric-guitar distortion: clipping
  Too quiet: hiss
  Too loud: clipping
  Just right: almost clipping
    –For every stage in the audio pipeline, both software and hardware. Every place you can set the volume level!
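One way to check "almost clipping" at a software stage is to measure the peak level relative to full scale. This is a minimal sketch assuming 16-bit integer samples; `peak_dbfs`, `is_clipped`, and the three-sample run length are illustrative choices, not from the talk.

```python
import math

FULL_SCALE = 32767  # largest positive 16-bit sample value

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (0 dBFS means at the limit)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak / FULL_SCALE)

def is_clipped(samples, run=3):
    """Heuristic: `run` consecutive samples stuck at full scale suggests clipping."""
    streak = 0
    for s in samples:
        streak = streak + 1 if abs(s) >= FULL_SCALE else 0
        if streak >= run:
            return True
    return False
```

A peak a few dB below 0 dBFS with no flat-topped runs is roughly the "just right" case; a peak far below that wastes resolution and raises the relative hiss.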
Where to get sounds
  Buy: fx + music libraries
  Build: record it yourself
  Build: synthesize it yourself
    –adjust an existing synth patch, a little or a lot
  Steal: websearch
    –8-bit might still suffice while prototyping
What to do with the sounds, once you have them
  common roles
    –alerts; acknowledgements; ambience
  sound vs. image
  speech vs. non-speech
  synchronization with visual events
  combining sounds
  tips for spatializing
Short and subtle is best!
  “Graphical excellence gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.” (Edward Tufte, The Visual Display of Quantitative Information)
  This applies to alerts, acks, and ambience.
  Clutter is worse with sound: no earlids!
Sound or image for a message?
  Use sound if:
    –Simple
    –Short
    –Standalone
    –Temporal
    –Demands immediate action
  Use visuals if:
    –Complex
    –Long
    –Referred to later
    –Spatial
    –Can deal with it later
Speech vs. non-speech
  Main advantage of speech: precise, simple!
  But...
    –Carries extra connotations
    –Tone of voice, imprecise word choice, can confuse or distract
  Non-speech doesn’t interfere with conversation
Synchronizing to other things
  Often 100 msec is accurate enough when passively observing a sonic and a visual event.
  Sync with an input action might need to be tighter: 30 msec, or even only 4 msec.
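To see what these tolerances mean in audio terms, they can be converted to sample frames. A small sketch, assuming a 44100 Hz sample rate; the helper names `ms_to_frames` and `within_sync` are hypothetical.

```python
def ms_to_frames(ms, rate_hz=44100):
    """Convert a duration in milliseconds to the nearest whole number of sample frames."""
    return round(ms * rate_hz / 1000)

def within_sync(audio_ms, visual_ms, tolerance_ms=100.0):
    """True if the sonic and visual event times differ by no more than the tolerance."""
    return abs(audio_ms - visual_ms) <= tolerance_ms
```

At 44100 Hz, 100 msec is 4410 frames, while 4 msec is only about 176 frames, which is smaller than many common output buffer sizes. Under that assumption, the tightest tolerance constrains the buffer size as well as the scheduling.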
Moving a probe through a dataset
  Trigger sounds on button-click, or when crossing thresholds.
  Or, play a continuous sound.
  Vector-valued data is trickier.
  Nominal data: one sound per name.
  Scope of probe: point, ball, shell.
    –Auto-size based on probe speed.
    –Granulation
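The two scalar-data strategies above (trigger on threshold crossings, or drive a continuous sound) might be sketched as follows. Everything here is an assumption for illustration: the function names, the event tuple layout, and the choice of an exponential pitch mapping between 220 Hz and 880 Hz.

```python
def threshold_crossings(values, thresholds):
    """Return (index, threshold, direction) for each crossing between consecutive probe samples."""
    events = []
    for i in range(1, len(values)):
        prev, cur = values[i - 1], values[i]
        for t in thresholds:
            if prev < t <= cur:
                events.append((i, t, "up"))    # rose to or through the threshold
            elif cur < t <= prev:
                events.append((i, t, "down"))  # fell below the threshold
    return events

def value_to_freq(value, lo_hz=220.0, hi_hz=880.0):
    """Map a data value in [0, 1] to a pitch, spacing equal data steps as equal musical intervals."""
    v = min(1.0, max(0.0, value))  # clamp out-of-range data
    return lo_hz * (hi_hz / lo_hz) ** v
```

Each crossing event would trigger a short sound, while `value_to_freq` would set the pitch of a continuous tone as the probe moves.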
Navigating a VR world
  Dense world
    –azimuth: click rate varies with turning rate
    –altitude: high/low beeps (rate varies with climb rate)
    –speed: vehicle-engine metaphor
  Large world: quiet continuous ambiences localized to individual parts of the world.
    –Play only the nearest two or three.
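Choosing which ambiences to play in a large world can be as simple as sorting by distance to the listener. A minimal sketch, assuming each ambience has a single 3-D position; `nearest_ambiences` and the source names are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def nearest_ambiences(listener, sources, k=3):
    """Return the names of the k ambience sources closest to the listener, nearest first.

    listener: (x, y, z) position; sources: dict mapping name -> (x, y, z).
    """
    by_distance = sorted(sources, key=lambda name: math.dist(listener, sources[name]))
    return by_distance[:k]
```

An application would call this as the listener moves and fade ambiences in or out as they enter or leave the returned list.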
Combining sounds
  It can get only so loud before clipping.
  Spatial separation (“panning”).
  Temporal separation (one sound at a time).
    –A party of drunks / a good dinner conversation.
  Frequency separation.
  Each layer gets its own tempo.
    –Heavily layered techno or orchestral music.
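Two of the points above, the loudness ceiling and spatial separation, can be illustrated with a short sketch. It assumes floating-point samples in [-1, 1]; the equal-power pan law and the peak-normalizing `mix` are common techniques chosen here for illustration, not methods named in the talk.

```python
import math

def equal_power_pan(sample, pan):
    """Split a mono sample into (left, right). pan: -1 hard left, 0 center, +1 hard right."""
    theta = (pan + 1) * math.pi / 4  # 0 .. pi/2
    return sample * math.cos(theta), sample * math.sin(theta)

def mix(tracks, limit=1.0):
    """Sum equal-length tracks, then scale the result down if its peak exceeds `limit`."""
    mixed = [sum(frame) for frame in zip(*tracks)]
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > limit:
        mixed = [s * limit / peak for s in mixed]  # avoid clipping by attenuating
    return mixed
```

Scaling the whole mix trades clipping for a quieter result, which is one concrete form of "it can get only so loud": every added layer leaves less room for the others.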
Spatialized sound
  Steady tones are worst.
  Bird chirps are best (they should know).
    –wide frequency band, complex attack
  Loudspeaker array
  Headphones with HRTF
  Motion-tracked headphones
    –In the real world you move your head slightly to tell where a sound comes from.