Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS1315: Introduction to Media Computation Sound Encoding and Manipulation.

Similar presentations


Presentation on theme: "CS1315: Introduction to Media Computation Sound Encoding and Manipulation."— Presentation transcript:

1 CS1315: Introduction to Media Computation Sound Encoding and Manipulation

2 Sound is made when something vibrates The vibration disturbs the air around it and makes changes in air pressure These changes in air pressure move through the air as sound waves The sound waves cause pressure changes against our ear drum sending nerve impulses to our brain

3 Sound waves are pressure waves Each air molecule oscillates back and forth, affecting their neighbors…air molecules themselves don’t travel from vibration source to your ear Creates areas of high and low pressure, called compressions and rarefactions

4 Sound waves are longitudinal Representation of sound by a sine wave is merely an attempt to illustrate the sinusoidal nature of the pressure-time fluctuations

5 Properties of sound waves Pressure fluctuation comes in cycles  frequency of wave is number of cycles per second (cps), or Hertz (Complex sounds have more than one frequency in them.)  amplitude is maximum height of the wave (aka maximum pressure fluctuation) Each repetition of a waveform is called a cycle

6 Volume and pitch: Psychoacoustics, the psychology of sound Our perception of pitch is related (logarithmically) to frequency  Higher frequencies are perceived as higher pitches  A above middle C is 440 Hz Our perception of volume is related (logarithmically) to changes in amplitude  If the amplitude doubles, it’s about a 3 decibel (dB) change

7 Humans and Pitch In general we can hear frequencies between 20 Hz and 20,000 Hz Hz = hertz = cycles per second 20,000 Hz = 20 kHz (kilohertz) The older we get, the less we can perceive really high frequencies – recall those commercials for ‘silent’ ring-tones for teens Many animals can make sounds and hear frequencies that are beyond what we can hear – hence the creation of dog whistles

8 “Logarithmically?” It’s strange, but our hearing works on ratios not differences, e.g., for pitch.  We hear the difference between 200 Hz and 400 Hz, as the same as 500 Hz and 1000 Hz  Similarly, 200 Hz to 600 Hz, and 1000 Hz to 3000 Hz Intensity (volume) is measured as watts per meter squared  A change from 0.1W/m 2 to 0.01 W/m 2, sounds the same to us as 0.001W/m 2 to 0.0001W/m 2

9 Decibel is a logarithmic measure A decibel is a ratio between two intensities: 10 * log 10 (I 1 /I 2 )  As an absolute measure, it’s in comparison to threshold of audibility  0 dB can’t be heard.  Normal speech is 60 dB.  A shout is about 80 dB

10 Digitizing Sound: How do we get that into numbers? Waves are analog (continuous) Remember in calculus, estimating the curve by creating rectangles? We can do the same to estimate the sound curve  Analog-to-digital conversion (ADC) will give us the amplitude at an instant as a number: a sample  How many samples do we need? Remember…pictures are continuous. How do we represent them digitally?

11 Understanding how computers represent sound Consider how a film represents motion…  Movie is made by taking still photos in rapid sequence at a constant rate, usually twenty-four frames per second  When the photos are displayed in sequence at that same rate, it fools us into thinking we are seeing continuous motion, even though we are actually seeing twenty-four discrete images per second http://music.arts.uci.edu/dobrian/digitalaudio.htm

12 http://en.wikipedia.org/wiki/Animation A collection of still frames

13

14

15

16

17

18

19

20

21

22

23

24

25 How computers represent sound Digital recording of sound works on the same principle  We take many discrete samples of the sound wave's instantaneous amplitude, store that information, then later reproduce those amplitudes at the same rate to create the illusion of a continuous wave http://music.arts.uci.edu/dobrian/digitalaudio.htm

26 How often should we take samples? Many many samples must be taken per second-- many more than are necessary for filming a visual image In fact, we need to take more than twice as many samples as the highest frequency we wish to record…enter Nyquist Theorem The number of samples taken per second is known as the sampling rate http://music.arts.uci.edu/dobrian/digitalaudio.htm

27 Nyquist Theorem* (will be on exam) We need twice as many samples as the maximum frequency in order to represent (and recreate, later) the original sound. The number of samples recorded per second is the sampling rate  If we capture 8000 samples per second, the highest frequency we can capture is 4000 Hz That’s how phones work  CD quality is 44,100 samples per second …what is the max frequency a CD can represent?

28 Nyquist Theorem Think back to the example of a film camera, which shoots 24 frames per second  If we're shooting a movie of a car, and the car wheel spins at a rate greater than 12 revolutions per second, it's exceeding half the "sampling rate" of the camera… the wheel completes more than 1/2 revolution per frame.  If, for example it actually completes 18/24 of a revolution per frame, it will appear to be going backward at a rate of 6 revolutions per second  In other words, if we don't witness what happens between samples, a 270-degree revolution of the wheel is indistinguishable from a -90-degree revolution. The samples we obtain in the two cases are precisely the same http://music.arts.uci.edu/dobrian/digitalaudio.htm

29 Nyquist Theorem For audio sampling, the phenomenon is practically identical Consider a graph of a 4,000 Hz cosine wave, being sampled at a rate of 22,050 Hz Sample is taken every 0. 045 milliseconds http://music.arts.uci.edu/dobrian/digitalaudio.htm

30 Nyquist Theorem Consider same 4,000 Hz cosine wave sampled at 6,000 Hz Sample is taken every 0. 167 milliseconds Wave completes more than 1/2 cycle per sample, and the resulting samples are indistinguishable from those that would be obtained from a 2,000 Hz cosine wave http://music.arts.uci.edu/dobrian/digitalaudio.htm

31 Nyquist Theorem The simple lesson to be learned from the Nyquist theorem: digital audio cannot accurately represent any frequency greater than half the sampling rate If we want to record frequencies as high as 20,000 Hz, how many times per second would we need to sample the sound? http://music.arts.uci.edu/dobrian/digitalaudio.htm

32 Digitizing sound with the computer: bit depth Each sample is a numerical value representing the instantaneous amplitude of the signal at the moment it was sampled The range of possible numbers used by a computer depends on the number of binary digits (bits) used to store each number As the number of bits increases, the range of possible numbers they can express increases by a power of two How many bits did we use per color channel per pixel? How many values could each color channel take on? http://music.arts.uci.edu/dobrian/digitalaudio.htm

33 Digitizing sound with the computer Each sample is stored as a number (two bytes=16 bits) What’s the range of available combinations?  16 bits, 2 16 = 65,536  But we want both positive and negative values To indicate compressions and rarefactions.  What if we use one bit to indicate positive (0) or negative (1)?  That leaves us with 16 - 1 = 15 bits  15 bits, 2 15 = 32,768  One of those combinations will stand for zero We’ll use a “positive” one, so that’s one less pattern for positives

34 +/- 32K Each sample can be between -32,768 and 32,767 Compare this to 0..255 for light intensity (i.e. 8 bits or 1 byte giving us 256 different values) Why such a bizarre number? Because 32,768 + 32,767 + 1 = 2 16 i.e. 16 bits, or 2 bytes< 0 > 0 0

35 Sounds as arrays (called lists in jython) Samples are just stored one right after the other in the computer’s memory That’s called an array  It’s an especially efficient (quickly accessed) memory structure (Like pixels in a picture)

36 Working with sounds We’ll use pickAFile and makeSound.  We want.wav files (NOTE: not all.wav files will work) We’ll use getSamples to get all the sample objects out of a sound We can also get the value at any index with getSampleValueAt Can get a sound’s length (getLength(sound)) Can get a sound’s sampling rate (getSamplingRate(sound)) Can save sounds with writeSoundTo(sound,"file.wav")

37 Demonstrating Working with Sound in JES >>> filename = pickAFile() >>> print filename /Users/monica/Desktop/MediaSources/preamble.wav >>> sound = makeSound(filename) >>> print sound Sound of length 421109 >>> samples = getSamples(sound) >>> print samples Samples, length 421109 >>> print getSampleValueAt(sound, 1) 36 >>> print getSampleValueAt(sound, 2) 29

38 Demonstrating working with samples >>> print getLength(sound) 220568 >>> print getSamplingRate(sound) 22050.0 >>> print getSampleValueAt(sound, 220568) 68 >>> print getSampleValueAt(sound, 220570) # note this is too high I wasn't able to do what you wanted. The error java.lang.ArrayIndexOutOfBoundsException has occured Please check line 0 of >>> print getSampleValueAt(sound, 1) 36 >>> setSampleValueAt(sound,1, 12) >>> print getSampleValueAt(sound, 1) 12

39 Working with Samples We can get sample objects out of a sound with getSamples(sound) or getSampleObjectAt(sound, index) A sample object it still part of the sound object, so if you change the sample object, the sound changes. Sample objects understand getSampleValue(sample) and setSampleValue(sample, value)

40 Example: Manipulating Samples >>> soundfile=pickAFile() >>> sound=makeSound(soundfile) >>> sample=getSampleObjectAt(sound, 1) >>> print sample Sample at 1 value at 59 >>> print sound Sound of length 387573 >>> print getSound(sample) Sound of length 387573 >>> print getSampleValue(sample) 59 >>> setSampleValue(sample, 29) >>> print getSampleValue(sample) 29

41 “But there are thousands of these samples!” How do we do something to these samples to manipulate them, when there are thousands of them per second? We use a loop and get the computer to iterate in order to do something to each sample. An example loop that just gets and reassigns the same value: for sample in getSamples(sound): value = getSampleValue(sample) setSampleValue(sample, value)

42 N.B.: function name changes setSampleValue(sample, 200)  used to be called setSample(sample, 200) getSampleValue(sample)  used to be called getSample(sample) So if you are using an old version of JES (pre-3.1) you may have to use setSample and getSample.

43 Properties of sound Frequency Amplitude Wavelength http://music.arts.uci.edu/dobrian/digitalaudio.htm

44 Properties of digital sound Sampling rate/frequency – not to be confused with the frequency of the sound we are trying to capture! Bit depth http://music.arts.uci.edu/dobrian/digitalaudio.htm


Download ppt "CS1315: Introduction to Media Computation Sound Encoding and Manipulation."

Similar presentations


Ads by Google