Multichannel solutions for sound enhancement and acoustic conditions in concert halls and operas David Griesinger Lexicon
Major Goals To explain and demonstrate the degree to which the acoustics of halls and operas may not be the same as the sound in these spaces. –The dependence of acoustics on visual aspects of architecture and on the expectations of the listeners may be underappreciated. To show how physics and psychoacoustics combine to produce absolute standards of acoustic quality for sound in opera houses and concert halls. –To suggest that sonic distance – the perceived audible distance between a performer and a listener – is the major descriptor of this acoustic quality in an opera house. To explain and demonstrate how electronic acoustic enhancement can be used to achieve higher sonic quality in some halls. To play as many musical examples as possible – using multichannel discrete surround and two-channel to five-channel conversion.
What constitutes good sound? Leo Beranek [JASA 107 pp Jan. 2000] – rank ordered houses by asking conductors to fill out a questionnaire. Semperoper Dresden is ranked nearly at the top, as is the Teatro alla Scala. But the SOUND of these two theaters is extremely different. Semperoper is highly reverberant, and La Scala is highly damped. In practice the remembered sound of an opera house can depend strongly on non-sonic factors.
High-Definition Demo Brahms F minor Piano Quintet –Performed by the faculty of the Point-Counter-Point Summer camp. –Video is high-definition (with some artifacts.) –Audio is two channel, single microphone pick-up. –Played here (after post production) with two-channel to five-channel processing.
Why is there so much confusion? 1. Research methods based on questionnaires suffer from two fundamental properties of acoustic perception: –The suppression of acoustic perception after a short time period. –The inability to accurately remember the sound quality. 2. We might be asking the wrong questions of the wrong people –The conductor is only one of the many people who work to present opera to the public –For most of these people the music is secondary to the drama. Their job is to get the story and the emotion to the audience. –To most people involved in opera production the clarity of the singers and the balance between them and the orchestra are of the utmost importance.
Measurement methods for halls and operas are inadequate and often misleading Sabine's reverberation time is useful, but it is the combination of reverberation time and reverberation level that we perceive. Jordan's EDT measure was intended to measure the direct/reverberant ratio. –But EDT is based on the decay of very long sounds, and does not measure the hall response to short sounds. Schroeder's method of measuring EDT (which is now an international standard) gives results that are independent of the direct/reverberant ratio. –Schroeder misunderstood the purpose of the measure. –His method yields results essentially identical to the reverberation time. C80 and related measures use 80ms as a division point between early and late. –But in fact human perception utilizes THREE time regions: 0-50ms, 50-150ms, and 150ms+. Intelligibility correlates best with C50, not C80, and reverberance correlates best with the ratio of the energy from 0-50ms to the energy at 150ms and greater.
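The early/late measures named on this slide can be illustrated with a short Python sketch. It assumes an idealised, envelope-only impulse response with a single exponential slope and no separate direct sound; the function names, the 8 kHz sample rate, and the 1.4 s reverberation time are illustrative choices, not values taken from the talk.

```python
import math

def clarity_db(impulse, sample_rate, split_ms):
    """Early-to-late energy ratio in dB, split at split_ms after the first sample."""
    split = int(round(sample_rate * split_ms / 1000.0))
    early = sum(x * x for x in impulse[:split])
    late = sum(x * x for x in impulse[split:])
    return 10.0 * math.log10(early / late)

def exponential_ir(rt60, sample_rate, seconds):
    """Idealised amplitude envelope that falls by 60 dB in rt60 seconds (assumption)."""
    k = 3.0 * math.log(10.0) / rt60  # amplitude decay constant
    return [math.exp(-k * n / sample_rate) for n in range(int(seconds * sample_rate))]

fs = 8000  # hypothetical sample rate
ir = exponential_ir(rt60=1.4, sample_rate=fs, seconds=3.0)
c50 = clarity_db(ir, fs, 50.0)  # split point the slide relates to intelligibility
c80 = clarity_db(ir, fs, 80.0)  # conventional split point
print(f"C50 = {c50:.1f} dB, C80 = {c80:.1f} dB")
```

For a measured response the same function would be applied to the recorded samples, starting at the arrival of the direct sound.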
It is difficult to remember the sound of acoustics Human physiology suppresses acoustic perception. –After 5 to 10 minutes in a particular space we lose the ability to perceive its acoustic properties. –Work by Shinn-Cunningham suggests that the process of extracting speech information from acoustic interference is adaptive. We adapt to a particular situation in 5 to 10 minutes, and the adaptation is unconscious. After the adaptation period muddiness (mulmig or glauque) becomes difficult to perceive and to remember. As a consequence, it is difficult to remember the properties of an acoustic space, particularly for speech. –Unless intelligibility is seriously compromised. We need to compare acoustic sounds BEFORE our physiology adapts to them. –We need relatively rapid A/B comparisons to accurately rank acoustic quality.
Boston Cantata Singers in Jordan Hall
Cantata Singers Rake's Progress Performance in Jordan Hall, January 26, Reverberation time in Jordan ~1.4 seconds at 1000Hz. This is similar to the Semperoper Dresden. The typical audience member is ~ 3 reverb radii from this singer. (reverb 10dB stronger than direct) The dramatic consequences are highly audible. It is amazing that in spite of the enormous acoustic distance, the performers still manage to project emotion to the listener. The performance received fabulous reviews. But the situation is not ideal. One reviewer commented on the regrettable lack of surtitles. The opera is in English.
Distance in Jordan Hall Reverberation time (full) measured as ~1.4 seconds at 1000Hz. Reverberation radius ~ 10 feet inside the stage house, ~14 feet in the hall. Thus a typical listener will be ~ 3 reverberation radii away from a singer who is fully upstage. This implies a direct/reflected ratio of –10dB. Jordan Hall is not renowned as an opera venue – perhaps we are hearing why. But the size and reverberation time are almost identical to the Semperoper Dresden, which is currently regarded as one of the best!
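The step from "3 reverberation radii" to "about –10dB" can be checked with a few lines of Python. This is a minimal sketch that assumes the direct sound falls off as 1/r² and the reverberant level is uniform through the hall, so the two are equal at one reverberation radius; the function name and the 42-foot example distance (3 x 14 feet) are illustrative.

```python
import math

def direct_to_reverb_db(distance, reverb_radius):
    """Direct/reverberant level in dB for a listener at `distance` from the source.

    Assumes direct energy falls as 1/r^2 and the reverberant level is uniform,
    so the ratio is 0 dB at exactly one reverberation radius.
    """
    return -20.0 * math.log10(distance / reverb_radius)

# Hypothetical listener 42 feet from the singer, with the ~14 foot radius quoted for the hall.
print(round(direct_to_reverb_db(42.0, 14.0), 1))
```

Under these assumptions three radii give roughly –9.5dB, which the slide rounds to –10dB.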
Binaural Recordings Manfred Schröder suggested that Binaural recordings could be used to compare different concert halls in the laboratory. The method has many difficulties –Matching of pinnae shape of the microphone to the listener. –Matching of the playback equipment to the listener. –These difficulties are particularly acute in studying concert hall acoustics. –Schröder suggested the use of a cross-talk canceller to solve some of these problems. However, in our experience the differences between opera houses are so large that relatively simple recording and playback equipment can capture the essential aspects of the sound. –And that these differences can easily be heard even with loudspeaker playback.
Glasses microphones Dual lavaliere microphones from Radio Shack plug directly into a mini-disk recorder. The result is free of diffraction from the pinnae of the person making the recording, which is an advantage. When combined with a calibrated pair of headphones, this system reproduces sonic distance, intelligibility, and envelopment quite well.
Binaural Examples in Opera Houses It is very difficult to study opera acoustics, as the sound changes drastically depending on: 1. the set design, 2. the position of the singers (actors), 3. the presence of the audience, and 4. the presence of the orchestra. Binaural recordings made during performances can give us important clues. Here is a short example from the Semperoper Dresden. This hall was rebuilt in 1983, and considerable effort was expended to increase the reverberation time. The RT is over 1.5 seconds at 1000Hz, which implies a reverberation radius of under 14. This hall is ranked nearly the best in Leo's survey. Note the excessive distance of the singers, and the low intelligibility.
Staatsoper unter den Linden Berlin The Staatsoper Berlin is similar in size to the Semperoper, and the acoustics in Berlin are probably much closer to the original acoustics in Dresden. RT at 1000Hz ~0.9s (without LARES). With LARES the RT at 1000Hz is ~1.1s, but the RT is ~1.7s at 200Hz. Here is a recording made from the parquet, about two-thirds of the way to the back wall. Although this hall does not even appear in Leo's survey, it is currently by far the most vital of the Berlin opera houses.
Deutsche Oper, Berlin In spite of the impressive wood paneling, the sound in this hall is rated between "pretty poor" and "ghastly" by the people I interviewed during a site visit. It is perhaps significant that this hall is moribund. They are searching for both a new music director and a new general manager. Concerning the acoustics, I was told that they are just waiting for the architect to die, so they can re-design it. But how should it be redesigned? Just what is wrong with it as it is?
Bolshoi The old Bolshoi in Moscow is similar in design to the Staatsoper but larger. The recording was made from the back of the second ring, and is monaural. RT ~ 1.1 seconds at 1000Hz, rising at low frequencies. In my opinion the sound in this hall is extremely good. The dramatic impact of the singers is phenomenal for such a large hall, and envelopment in the parquet is high. This theater is extremely popular – nearly impossible to get into without paying a scalper ~$100.
New Bolshoi The New Bolshoi is very similar to the Semperoper Dresden. The Semperoper was the primary model for the design. RT ~1.3 seconds at 1000Hz. What is it about the SOUND of this theater that makes communication with the singers so difficult? The general manager views this theater as unsuccessful acoustically. There have been many complaints – the singers are both too loud and too hard to hear. This theater suffers greatly from having the old Bolshoi next door!
The Sound of Opera – the blind opera fan. What distinguishes the SOUND of the New Bolshoi from the Staatsoper Berlin, or the Royal Theater, Copenhagen? –Reverberation time? –Intelligibility? –Envelopment? –Balance? –All might be involved. An informal poll of acousticians gave the result that EVERY ONE thought 1.5 seconds was the ideal reverberation time. –And yet the two Bolshoi theaters dramatically contradict this idea. Intelligibility in ALL the theaters I have visited is satisfactory. Here is dialog from the Semperoper: Envelopment in the parquet of the old Bolshoi is high, even with a low reverberation time. Here is a segment from Giselle. Balance IS important – but it is not sufficient to explain the differences we hear.
Balance between the orchestra and the soloists Reverberation time affects balance, due to the directional properties of the human voice. Note that the loudness of the orchestra increases about 1.5dB as RT rises from 1s to 1.5s. This rise is not sufficient to explain the large dramatic differences between Semperoper Dresden and Staatsoper Berlin.
Sonic Distance Even casual listening to the examples in this paper reveals that the most obvious difference is how far away the voices seem. Loudness is a primary distance cue. –This distance cue can be overcome by trained actors and singers, who know how to project their voices with sufficient energy. –If you have the money you can hire singers with more vocal power. The main secondary cue for distance is the ratio between the loudness of the direct sound and reflected energy that arrives between 50 and 150ms after the direct sound. When this energy is excessive the singers can sound loud, but muddled and far away. –Dramatic connection between the actors and the audience suffers.
Human sound perception – Separation of the sound field into foreground streams. Acousticians are entranced with reflections – rather arbitrarily divided into early and late. –But human perception works differently. –Human brains evolved to understand speech, and to ignore reflections. Speech consists of a series of foreground sound events separated by periods of relative silence, in which the background sound can be heard. [Figure: third-octave filtered speech. Blue 500Hz, red 800Hz.]
One of the most important preliminary functions of human hearing is stream formation Foreground sound events (phones or notes) must be separated from a total sound field containing both foreground and background sounds (reverberation, noise). Foreground events are then assembled into streams of common direction and/or timbre. A set of events from a single source becomes a sound stream, or a sound object. A stream consists of many sound events. –Meaning is assigned to the stream through higher level neural functions, including phoneme recognition and the combination of phonemes into words. Stream separation is essential for understanding speech –When the separation of sound streams from noise is easy, intelligibility is high. Separation is degraded by noise and reverberation. This degradation can be measured by computer analysis of binaural speech recordings. Stream formation is entirely sub-conscious. –We can consciously choose which stream to listen to, but we cannot influence the separation process.
Separation of binaural speech through analysis of amplitude modulations Analysis into 1/3 octave bands, followed by envelope detection. By counting edges above a certain threshold we can reliably count syllables in reverberant speech. This process yields a measure of intelligibility – not distance. [Figure: two panels, reverb forward and reverb backward. Green = envelope, yellow = edge detection.]
Analysis of binaural speech We can then plot the syllable onsets as a function of frequency and time, and count them. Reverberation forward: note many syllables are detected (~30). Reverberation backwards: notice hardly ANY are detected (~2). RASTI will give an identical value for both cases!!
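The envelope-then-edge idea on this slide can be sketched in Python as follows. This is not the analysis used in the talk: it works on a single broadband signal rather than 1/3 octave bands, and the smoothing window, the 6dB rise threshold, the look-back time, and the refractory period are all assumed values chosen for illustration.

```python
import math

def envelope(signal, sample_rate, window_ms=20.0):
    """Rectify the signal and smooth it with a trailing moving average."""
    n = max(1, int(sample_rate * window_ms / 1000.0))
    out, acc = [], 0.0
    for i, x in enumerate(signal):
        acc += abs(x)
        if i >= n:
            acc -= abs(signal[i - n])
        out.append(acc / n)
    return out

def count_onsets(env, sample_rate, rise_db=6.0, lookback_ms=40.0, refractory_ms=100.0):
    """Count rising edges: the envelope rose by at least rise_db within lookback_ms."""
    back = int(sample_rate * lookback_ms / 1000.0)
    hold = int(sample_rate * refractory_ms / 1000.0)
    floor = 1e-6  # keeps the level ratio finite during silence
    count, last = 0, -hold
    for i in range(back, len(env)):
        rise = 20.0 * math.log10((env[i] + floor) / (env[i - back] + floor))
        if rise >= rise_db and i - last >= hold:
            count += 1
            last = i
    return count

# Test signal (hypothetical): five 100ms bursts of a 500Hz tone separated by 150ms of silence.
fs = 4000
signal = []
for _ in range(5):
    signal += [0.0] * 600
    signal += [math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(400)]
signal += [0.0] * 600

onsets = count_onsets(envelope(signal, fs), fs)
print(onsets)
```

Each burst produces one detected edge. Sounds whose level builds up slowly, as in the reversed-reverberation case, rise too gradually to cross the threshold, which is why far fewer onsets are counted there.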
How do we perceive distance and space? Reflected energy interferes with itself at the listener's ears, producing fluctuations in the sound pressure. We perceive fluctuations in level during a sound event and up to 150ms after the end of the sound as a sense of distance from the sound source. If the reflections are spatially diffuse (from all directions) the fluctuations will be different in each ear. –Fluctuations that occur during the sound event and within 50ms after the end of the event produce both a sense of distance and the perception of a space around the source. This is Early Spatial Impression (ESI). The listener is outside the space – and the sound is not enveloping. But the sense of distance is natural and pleasant. –Spatially diffuse reflections later than 50ms after the direct sound produce a sense of space around the listener. This can be perceived as envelopment. (Umgebung)
The downside of Distance Perception Reflections during the sound event and up to 150ms after it ends create the perception of distance. But there is a price to pay: –Reflections from 10-50ms do not impair intelligibility. The fluctuations they produce are perceived as an acoustic "halo" or "air" around the original sound stream. (ESI) –Reflections from 50-150ms contribute to the perception of distance – but they degrade both timbre and intelligibility, producing the perception of sonic MUD. (Mulmig, Glauque) The addition of mud to a speech or singing voice has serious dramatic consequences.
Distance and Drama: Copenhagen New Stage We were asked to improve speech intelligibility in this theater, specifically for drama. Using some extraordinary technology we succeeded. But we also increased the sense of sonic distance. The theater directors decided to fix the intelligibility problems by improving the diction of the actors. We completely agreed!
Example of reflections in the 50-150ms range Balloon burst in the New Bolshoi. Source was on the forestage, and the receiver was in the stalls at row 10. Note the HUGE burst of energy about 50ms after the direct sound. The 1000Hz octave band shows the combined reflections to be 6dB stronger than the direct sound. The sound clip shows the result of this impulse response on speech. The result (in this case) is a decrease in intelligibility and an increase in distance.
Human Perception – the background sound stream We perceive the background sound stream in the spaces between the individual sounds. The background stream is perceived as continuous, even though it may be rapidly fluctuating. When masking by foreground sounds is low the background stream is perceived at an absolute level, not as a ratio to the foreground sound. –This is why playing a recording at a higher level causes the perceived amount of reverberation to increase. Perception of the background stream is inhibited for 50ms after the end of a sound event, and reaches full sensitivity only after 150ms.
Example of foreground/background perception (as a Cool Edit mix) Series of tone bursts (with a slight vibrato) increasing in level by 6dB. Reverberation at constant level. Mix with direct increasing 6dB. Result: background tone seems continuous and at constant level.
Example of background loudness as a function of Reverberation Time Tone bursts at constant level, mixed with reverberation switching from 0.7s RT to 2.0s RT, and reducing in level ~8dB Output – perceived background is constant! (But the first half is perceived as farther away!) Note the reverb level in the mix is the same at 150ms and greater. One gets the same results with speech.
Summary: Perceptions relating to stream separation First is the creation of the foreground stream itself. The major perception is intelligibility. Second is the formation of the background sound stream from sounds which occur mostly 150ms after the direct sound ends. The perception is reverberance. Third is the perception of Early Spatial Impression (ESI) from reflections arriving between 10-15ms and 50ms after the end of the direct sound. The perception is of distance and acoustic space around the source. Fourth is the timbre alteration and reduction of intelligibility due to reflections from 50 to 150ms after the end of the direct sound event. The perception is MUD and distance. Human hearing has been designed to suppress the perception of ESI and of mud. As long as intelligibility is more or less satisfactory, after an adaptation period we no longer hear these properties of the room. –And we usually cannot remember them. –This does NOT mean they are dramatically or artistically unimportant!
Synthetic Opera House Study We can use MC12 Logic 7 to separate the orchestra from the singers on commercial recordings, and test different theories of balance and reverberation. From Elektra – Barenboim. Balance and reverb in original is OK by Barenboim. Original Mix Vocals Downmix – with reverb on the orchestra, but not on the singers Reverb from orchestra Reverb from singers Downmix with reverb on the singers. Note the result is MUDDY! Dresden Berlin
Localization Localization is related to stream formation. It depends strongly on the onset of sound events. –IF the rise-time of the sound event is more rapid than the rise-time of the reverberation: –then during the rise time the IID (Interaural Intensity Difference) and the ITD (Interaural Time Difference) are unaffected by reflections. We can detect the direction of the sound source during this brief interval. Once detected, the brain HOLDS the detected direction during the reverberant part of the sound. And gives up the assigned direction very reluctantly. –The conversion between IID and ITD and the perceived direction is simple in natural hearing, but complex (and unnatural) when sound is panned between two loudspeakers. Sound panning only works because localization detection is both robust and resistant to change. A sound panned between two loudspeakers is profoundly unnatural.
Detection of lateral direction through Interaural Cross Correlation (IACC) Start with binaurally recorded speech from an opera house, approximately 10 meters from the live source. We can decompose the waveform into 1/3 octave bands and look at level and IACC as a function of frequency and time. Level ( x = time in ms y=1/3 octave bands 640Hz to 4kHz) IACC Notice that there is NO information in the IACC below 1000Hz!
Position determination by IACC We can make a histogram of the time offset between the ears during periods of high IACC. For the segment of natural speech in the previous slide, it is clear that localization is possible – but somewhat difficult.
Position determination by IACC (continued) We can duplicate the sound of the previous example by adding reverberation to dry speech, and giving it a 5 sample time offset to localize it to the right. As can be seen in the picture, the direct sound is stronger in the simulation than in the original, and the IACCs - plotted as 10*log10(1-(1/IACC)) - are stronger. Level displayed in 1/3 octave bands (640Hz to 4kHz) IACC in 1/3 octave bands
Position determination by IACC (continued) Not surprisingly, due to the higher direct sound level and the artificially stable source the lateral direction of the synthetic example is extremely clear and sharply defined. Histogram of the time offset in samples for each of the IACC peaks detected, using the synthetically constructed speech signal in slide 2.
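The time-offset search behind these histograms can be sketched as a normalised cross-correlation over a small range of lags. The Python below is an illustration only: it analyses one broadband frame rather than 1/3 octave bands, and the function name, the 2000-sample noise frame, and the ±20-sample search range are assumptions. The 5-sample offset mirrors the synthetic example described above.

```python
import math
import random

def best_lag(left, right, max_lag):
    """Return (lag, peak) maximising the normalised cross-correlation.

    A positive lag means the right channel is delayed relative to the left.
    """
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    best = (0, -1.0)
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, x in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                total += x * right[j]
        value = total / norm
        if value > best[1]:
            best = (lag, value)
    return best

# Hypothetical dry source: noise reaching the left ear 5 samples after the right ear.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
right = source
left = [0.0] * 5 + source[:-5]

lag, peak = best_lag(left, right, max_lag=20)
print(lag, round(peak, 3))
```

Collecting the lag of each frame whose peak exceeds a chosen threshold, then counting how often each lag occurs, gives a histogram of the kind shown on these slides. Added reverberation lowers the peak values and spreads the lags.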
Summary so far… Rank ordering opera houses or concert halls through the memory of conductors is probably not very useful. When the sounds of a house can be compared rapidly (through electronic enhancement or recording) there is almost unanimous agreement on the best sound, and this sound is highly articulate. The conductor will insist on some low-frequency envelopment on the orchestra, as long as vocal clarity is not compromised. Considerable experimentation has found that there is an ideal reverberation profile for opera performances. –This profile is based on the physiological properties of human hearing –And is thus the same profile as we need on a good recording.
The Ideal Reverberation above 1000Hz. The ideal profile has three distinct slopes. 1. Reflections in the 20ms to 50ms time range with a total energy of -4dB to -6dB relative to the direct sound combine with the direct sound to produce a decay rate under 1 second RT. 2. Reflections in the 50ms to 150ms time range decay much more gradually – with a slope greater than 2 seconds RT. 3. Reflections after 150ms produce our perception of reverberance, and should decay at a rate appropriate to the music. Aside – this profile is a bit of a theoretical concept. Measurement data in halls is sufficiently chaotic and place dependent to prevent one from actually observing a triple slope!
Most real rooms (at all frequencies) have exponential decay Exponential decay produces a single-slope. If the direct sound is strong enough the effective early decay can be short. - But then there will be too few early reflections and the late reverberation will be weak. If the direct sound is weak, there will be too much energy between 50 and 150ms, and the sound will be MUDDY.
The ideal reverberation profile is frequency dependent For frequencies above 1kHz (speech) the ideal profile has three distinct slopes –1. The early slope – consisting of the direct sound and the 0-50ms reflections. This slope is steeply down – less than 1 sec RT. –2. The middle slope – 50 to 150ms – is relatively flat – can have an RT of 3s or more. This flat section of the profile maximizes the late reverberant level while minimizing the muddiness. –3. The slope of the decay beyond 150ms can be around 1.3 seconds RT for opera and up to 2 seconds RT for orchestra (if the early slope is short enough to maintain clarity.) Below 500Hz the decay probably should be single sloped, with RT of 1.7s or higher. –This is because in our experience a single slope decay at low frequencies produces the most pleasing sound on an orchestra. Thus in a hall with natural acoustics the reverberation time and reverberation level should increase below 500Hz.
Teatro alla Scala, Milan Echograms from La Scala (from Beranek) illustrate these profiles: Top curve - 2kHz octave band, 0-200ms. At 2kHz note the high direct sound and low level of reflections in the 50-150ms time range. Bottom curve - 500Hz octave band, 0-200ms. Note the high reverberation level – and short critical distance.
Let's listen to Alla Scala! Matlab can be used to read these printed impulse responses and convert them into real impulse responses. –1. First we read the .bmp file from a scan, and convert the peaks in the file to delta functions with identical time delay, and an amplitude equivalent to the peak height. All the direct sound energy is combined into a single delta function, and the level of the direct sound is normalized (relative to the rest of the decay), so the 2kHz and 500Hz impulses can be accurately combined. –2. We then apply a random variable of about ±5ms to the delay time to correct for the quantization in the scan. –3. We then extend the echogram to higher times by tacking on an exponentially decaying segment of white noise, with a decay rate equal to the published data for the hall. –4. We then filter the result for the 2kHz echogram with a 1kHz high-pass filter, and combine it with the 500Hz echogram low-pass filtered at 1kHz. –5. If desired we can create a right channel and a left channel reverberation by using a different set of random variables in steps 2 and 3. –6. We convolve a segment of dry sound with the new impulse response. –The result is sonically quite convincing!
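Steps 1 to 3 of this procedure can be sketched in Python. This is a rough illustration, not the Matlab used for the talk: the peak list stands in for values read from a scanned echogram, `tail_amp` (the starting amplitude of the noise tail) is a free parameter introduced here, and the 1.2 s decay time is a placeholder for whatever published figure applies. Band filtering, stereo decorrelation, and the final convolution (steps 4 to 6) are left out.

```python
import math
import random

def rebuild_impulse_response(peaks, rt60, tail_amp, sample_rate=8000, total_s=1.5,
                             jitter_ms=5.0, seed=0):
    """peaks: (time_ms, amplitude) pairs; the first entry is the direct sound."""
    rng = random.Random(seed)
    ir = [0.0] * int(total_s * sample_rate)
    # Steps 1 and 2: one delta per peak, with the reflections jittered by up to jitter_ms.
    for k, (t_ms, amp) in enumerate(peaks):
        if k > 0:  # the direct sound stays at its original position
            t_ms += rng.uniform(-jitter_ms, jitter_ms)
        idx = min(len(ir) - 1, max(0, int(round(t_ms * sample_rate / 1000.0))))
        ir[idx] += amp
    # Step 3: exponentially decaying white noise from the last peak onward.
    start = int(max(t for t, _ in peaks) * sample_rate / 1000.0)
    k_amp = 3.0 * math.log(10.0) / rt60  # amplitude falls 60 dB in rt60 seconds
    for n in range(start, len(ir)):
        ir[n] += tail_amp * math.exp(-k_amp * (n - start) / sample_rate) * rng.gauss(0.0, 1.0)
    return ir

# Hypothetical peak list; real values would come from the scanned plot.
peaks = [(0.0, 1.0), (22.0, 0.35), (41.0, 0.28), (63.0, 0.2),
         (95.0, 0.15), (140.0, 0.12), (190.0, 0.1)]
ir = rebuild_impulse_response(peaks, rt60=1.2, tail_amp=0.05)
print(len(ir), ir[0])
```

Calling the function again with a different `seed` gives a second, decorrelated response, which is one way to realise the left and right channels mentioned in step 5.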
Alla Scala at 500Hz – reading the plot Top curve – 500Hz measured impulse response as given by Beranek. JASA Vol. 107 #1, Jan 2000, pp Bottom curve – impulse response as regenerated from delta functions, passed through a 2kHz 6th order 1 octave filter. Note the correspondence is more than plausible.
Alla Scala 500Hz – randomizing and extending Top graph: Alla Scala published data. Bottom graph: regenerated impulse response after randomization and extension.
Listen to Alla Scala, NNT Tokyo, Semperoper 2kHz and 500Hz impulse responses from Scala Milan, NNT Theater Tokyo, Semperoper Dresden (all data from Beranek). Original Sound 2kHz 500Hz
How can we make a room ideal for opera? A conventional opera house can be made to approach the sonic ideal by MAXIMIZING the reverb radius for the soloists, for frequencies above 700Hz. –This involves arranging the audience and reflectors around the stage to direct the sound of the singers directly into the audience. –These architectural features increase the very early energy while decreasing the sound power available to the middle and late reverberation. –At the same time, we should try to maximize the reverberation time below 500Hz. To some degree, the success of a design can be seen immediately in a picture taken from the stage. –We need only notice how much absorption we see in front of us. The more absorption and less bare wall we see, the higher the clarity.
Pictures from the stage Deutsche Oper – might as well tear it down. New Bolshoi – just add curtains on the back wall. Deutsche Staatsoper – vital, exciting, and alive – with or without the LARES.
Compromises The fight between those who like clarity and those who like reverberance is relatively recent. –Reverberance currently has the upper hand. –One of the purposes of this talk is to suggest that the emphasis on reverberance is misguided. –In every case where the author has worked closely with a music director, the director has wanted a more reverberant sound, like the Semperoper. However, when given the opportunity to hear what Semperoper reverberation actually sounds like, the director invariably prefers a much less reverberant sound. In fact, it is my observation that the difference between the reverberance the conductor wants, and the natural reverberance of a dry opera house is extremely subtle. In a controlled test at the Royal Theater in Copenhagen (set up by Anders Gade) 80% of the test subjects could hear no difference at all. –In every case where we have had the opportunity to increase clarity, or improve the balance between the singers and the orchestra, the improvement has been noticed immediately, and appreciated, by everyone, including the conductor.
Ideal sound through electronics Electronic enhancement has the potential to create ideal opera acoustics –But only if the system is capable of creating a triple-slope decay at high frequencies, and a single-slope decay at low frequencies. –This combination is not common with currently available systems!
Acoustic Feedback – bane or boon? All enhancement systems have significant feedback between the loudspeakers and the microphones. –A single slope decay with an RT of 1.7 seconds MUST create a reverberation radius which is relatively small – usually under four meters in a typical opera house. –If the pickup microphones are separated from each sound source by more than this distance, they MUST pick up more reverberation than direct sound. Current enhancement systems divide into two types: –Those that utilize the acoustic feedback to increase the reverberation time directly: Philips MCR, Carmen. –And those that include a reverberation device in the electronics, and couple this device electronically to the hall: Lares, Paoletti (Stagetec), ACS, SIAP. Only the second type are capable of creating a dual or triple-slope decay.
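The "under four meters" figure can be checked against the usual Sabine-based approximation for the reverberation radius, r ≈ 0.057·√(Q·V/RT), with V in cubic meters and Q the source directivity. The Python sketch below assumes an omnidirectional source (Q = 1) and a hypothetical hall volume of 8000 m³; neither number comes from the talk.

```python
import math

def reverberation_radius_m(volume_m3, rt60_s, directivity=1.0):
    """Distance in meters at which direct and reverberant energy are equal.

    Sabine-based approximation for a diffuse field; directivity=1.0 is omnidirectional.
    """
    return 0.057 * math.sqrt(directivity * volume_m3 / rt60_s)

# Hypothetical 8000 cubic meter house with the 1.7 second decay mentioned on the slide.
radius = reverberation_radius_m(8000.0, 1.7)
print(round(radius, 2))
```

With these assumed values the radius comes out a little under four meters, so a microphone more than a few meters from a singer is already in the reverberation-dominated part of the field.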
Feedback and coloration Any time there is significant acoustic feedback there will be coloration. –Acoustic feedback paths have complex frequency response, and this response is audible. This coloration must be minimized in a successful design. –There are no easy solutions. Almost all systems start with a multichannel design. With many channels the individual response variations in each channel tend to average out. But each channel must have its own microphone and speaker, and all devices must be separated physically by the reverberation radius. –This physical separation is tricky to realize in practice. –Alas, most available systems minimize the amount of coloration by minimizing the system gain. Most available systems are not capable of doing very much at all. This is sometimes an advantage, as Eckhard will tell. –Some available systems minimize the coloration by denying that feedback exists (ACS, and to some degree SIAP)
Lares System Lares uses a multichannel concept –But it uses an electronic trick to allow a single pair of microphones to drive a large number of output channels (typically four or eight) –As a result it becomes practical to place the microphones close to the performers. The result is a cleaner pickup. The pickup microphones contain less coloration and reverberation. The energy content in the 50 to 150ms time range can be minimized this way (and only this way).
Lares Block Diagram A typical Lares installation includes two pickup microphones and eight separate output channels. Each microphone is connected to each output channel through a separate, independently time varying reverberation device. The frequency dependence of the reverberant level, and the frequency dependence of the reverberation time can be separately adjusted. Lares also includes a noise generator and 1/3 octave analyzer for setting and verifying the overall system gain.
Lares is highly resistant to coloration This is achieved through the multichannel design, and the independent time variance. The type of time variance used minimizes the pitch-shift, which is not audible when the system is correctly adjusted. As a result a high reverberant level can be achieved, even when the pickup microphones are far from the sound sources. –And this is sometimes a problem. Customers turn the system up too high, or insist on placing the microphones too far away. –The result can be both muddiness and excessive coloration (at least to my ears.) –There are way too many existing Lares installations that have these problems!
Demonstrations of Lares
Exponential Decay Sabine's breakthrough –Extensively studied by Morse, Beranek, Eyring, etc. –In rooms where the absorption is relatively uniformly distributed the decay of sound follows a straight line when plotted logarithmically. –When the decay is exponential we can precisely predict the ratio between the direct sound and the reflected sound in the 50-150ms time range. For computing sonic distance the direct sound may be augmented by reflections that arrive before 50ms. –At very short reverberation times the reflected energy is concentrated into times less than 50ms after the direct sound, and perceived distance is low, regardless of the direct/reflected ratio. –Moderate reverberation times (1.2 – 1.6 seconds) concentrate the energy between 50 and 150ms. Halls with these reverberation times can easily sound muddy. (mulmig or glauque)
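For a purely exponential decay the share of reflected energy in each time region has a closed form, which makes the claims on this slide easy to check. The Python sketch below assumes the reflected energy starts at the arrival of the direct sound and decays by 60dB in one RT; the function name and the list of reverberation times are illustrative.

```python
import math

def band_fraction(rt60, t_start, t_end):
    """Fraction of the total reflected energy arriving between t_start and t_end (seconds)."""
    k = 6.0 * math.log(10.0) / rt60  # energy decay constant: 60 dB in rt60 seconds
    return math.exp(-k * t_start) - math.exp(-k * t_end)

for rt in (0.5, 1.0, 1.26, 1.6, 2.5):
    early = band_fraction(rt, 0.0, 0.05)
    middle = band_fraction(rt, 0.05, 0.15)
    late = band_fraction(rt, 0.15, float("inf"))
    print(f"RT {rt:4.2f}s  0-50ms {early:.2f}  50-150ms {middle:.2f}  150ms+ {late:.2f}")
```

Under this model most of the energy falls before 50ms when the RT is short, and the 50-150ms share peaks at an RT of about 1.26 seconds (where it reaches roughly 38%), which is in line with the 1.2 to 1.6 second range the slide associates with muddiness.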
Acoustic research through synthesis We do not need to use reflections to generate the perception of acoustics! –It is the total reflected energy in different time bands that matters, along with the spatial and frequency distribution of that energy. –We can synthesize reverberation by convolving an input signal with an impulse response sculpted from noise. This technique allows us to investigate the effects of different energy profiles. –I decided to convolve four identically shaped noise bursts, each 46ms long, with a segment of the Rake's Progress. –These segments can then be strung together with different delays and amplitudes to form an arbitrary reverberation. For example, let's synthesize an exponential decay of 1.4 seconds RT, with a variable direct/reverberant ratio:
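A rough Python sketch of an impulse response "sculpted from noise" follows. It differs from the procedure described above in one respect that should be kept in mind: it writes fresh noise into every 46ms segment of a single impulse response, rather than convolving four fixed bursts with the music and summing delayed copies. The sample rate, segment count, and function name are assumptions.

```python
import math
import random

def synth_impulse_response(rt60, direct_to_reverb_db, sample_rate=8000,
                           burst_ms=46.0, n_bursts=40, seed=1):
    """Direct click followed by noise segments whose levels step down along an exponential decay."""
    rng = random.Random(seed)
    burst_len = int(sample_rate * burst_ms / 1000.0)
    db_per_burst = 60.0 * (burst_ms / 1000.0) / rt60  # level drop from one segment to the next
    tail = []
    for b in range(n_bursts):
        gain = 10.0 ** (-db_per_burst * b / 20.0)
        tail.extend(gain * rng.gauss(0.0, 1.0) for _ in range(burst_len))
    # Scale the direct click so that direct energy / tail energy matches the requested ratio.
    tail_energy = sum(x * x for x in tail)
    direct = math.sqrt(tail_energy * 10.0 ** (direct_to_reverb_db / 10.0))
    return [direct] + tail

ir_0db = synth_impulse_response(rt60=1.4, direct_to_reverb_db=0.0)
ir_m3db = synth_impulse_response(rt60=1.4, direct_to_reverb_db=-3.0)
print(len(ir_0db))
```

Changing the per-segment gains is all it takes to move from this exponential profile to an arbitrary one, which is the point of the technique.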
Synthetic impulse response from noise, 1.4s exponential decay [Figure: two panels, linear amplitude scale and log amplitude scale.] This is the sound of a one sample click at samples/sec. This is NOT music or speech.
Window averaging, direct/reverb = 0dB [Figure: two panels, 25ms averaging window and 100ms averaging window.] We can average the impulse response over a selected time period. Mathematically this is the same as the average response of the system to an input signal (phone or note) with a duration of the averaging period. The first window represents the response of the room to a 25ms sound, and the second to a 100ms sound. Note that the EDT we perceive is HIGHLY dependent on the length of the note!
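The averaging step can be sketched as a running mean of the squared impulse response over a trailing window whose length equals the note duration. This is an illustration under that assumption, not the code used to produce the plots; the function name is hypothetical.

```python
import math

def window_average_db(impulse_response, sample_rate, window_seconds):
    """Running mean of squared amplitude over a trailing window, in dB relative to its peak."""
    win = max(1, int(round(window_seconds * sample_rate)))
    # Cumulative energy lets each window sum be read off as a difference.
    cumulative = [0.0]
    for x in impulse_response:
        cumulative.append(cumulative[-1] + x * x)
    levels = []
    for i in range(1, len(cumulative)):
        start = max(0, i - win)
        levels.append((cumulative[i] - cumulative[start]) / win)
    peak = max(levels)
    return [10.0 * math.log10(e / peak) if e > 0.0 else float("-inf") for e in levels]

if __name__ == "__main__":
    # A lone click: the averaged level stays at 0 dB for exactly one window length.
    click = [1.0] + [0.0] * 99
    short = window_average_db(click, sample_rate=1000, window_seconds=0.010)
    long_ = window_average_db(click, sample_rate=1000, window_seconds=0.050)
    print(short[9], short[10], long_[49], long_[50])
```

Applied to an impulse response with a strong direct spike, a short window lets the level fall away as soon as the spike leaves the window, while a long window keeps the spike inside the average for longer and so flattens the early part of the curve.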
Schroeder Integration, direct/reverb = 0dB Schroeder Integration – reverse integration – represents the response of the room to a note of infinite duration. Jordan's method of determining EDT takes some account of the strength of the direct sound. Schroeder's method for EDT completely ignores the strength of the direct sound. Neither method is likely to predict the response of the room to speech or normal music.
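Reverse integration itself is short enough to sketch. The code below accumulates the squared impulse response from the end backwards and then estimates a decay time from a two-point slope. It uses the common convention of extrapolating the 0 to -10 dB span to 60 dB for EDT, and it does not reproduce the specific Jordan or Schroeder procedures compared on the slide. Function names are hypothetical, and the input is assumed to contain some energy.

```python
import math

def schroeder_db(impulse_response):
    """Backward-integrated energy decay curve in dB, 0 dB at the first sample."""
    remaining = 0.0
    curve = []
    for x in reversed(impulse_response):
        remaining += x * x
        curve.append(remaining)
    curve.reverse()
    total = curve[0]
    # Trailing samples with no remaining energy are dropped (log of zero is undefined).
    return [10.0 * math.log10(e / total) for e in curve if e > 0.0]

def decay_time(curve_db, sample_rate, start_db, end_db):
    """Two-point slope between start_db and end_db, extrapolated to a 60 dB decay."""
    i0 = next(i for i, v in enumerate(curve_db) if v <= start_db)
    i1 = next(i for i, v in enumerate(curve_db) if v <= end_db)
    slope = (curve_db[i1] - curve_db[i0]) / ((i1 - i0) / sample_rate)  # dB per second
    return -60.0 / slope

if __name__ == "__main__":
    fs, rt = 1000, 1.4
    k_amp = 3.0 * math.log(10.0) / rt
    ideal = [math.exp(-k_amp * i / fs) for i in range(4 * fs)]
    curve = schroeder_db(ideal)
    print(f"EDT of an ideal 1.4s decay: {decay_time(curve, fs, 0.0, -10.0):.2f} s")
```

For a purely exponential tail the early and late slopes agree, so any EDT definition returns the reverberation time. The definitions only diverge once a direct spike or a non-exponential profile is added.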
Window Averaging, direct/reverb = -3dB [Figure: two panels, 25ms averaging window and 100ms averaging window.] For a 25ms sound the effective reverberation time is 0.9 seconds, so at least these sounds are heard with high articulation. 100ms sounds, on the other hand, are smoothed to nearly the same slope as the late reverberation time.
Schroeder Integration, direct/reverb = -3dB Very long notes still show some dual-slope decay. Jordan's method for EDT is sensitive to this difference; Schroeder's is not.
Examples See surround encoded DTS exponential decay
Non-exponential decay direct/reverb = -3dB It is interesting to ask what happens when there is a high burst of very early reflections, followed by a relatively level energy curve out to beyond 160ms. This type of decay minimizes sonic distance, while maintaining reverberance and envelopment.
Non-exponential decay direct/reverb = -3dB
% amplitudes of the different time periods in dB
% all dB values correspond to the energy content of the mix
d1 = -1.7; % direct sound
l1 = -1.7; % 20ms-60ms
l2 = -8.5; % 60ms-100ms
l3 = -8.5; % 100ms-140ms
l4 = -8.5; % 140ms-180ms
l5 = -8.5; % 180ms-220ms
l6 = -10.2; % 220ms-260ms
l7 = -11.9; % 260ms-300ms
l8 = -13.6; % 300ms-340ms
This is the MATLAB code that sets up the non-linear reverberation. Note that for this example, the early reflections have equal energy to the direct sound. Sonically, it is much better if the early energy is –4dB to –6dB relative to direct.
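As a cross-check on these levels, the direct/reverberant ratio in the slide title can be recomputed from the band values. The sketch below is in Python rather than MATLAB and is not part of the original code. It assumes, as the comment in the listing states, that each dB value is the energy content of its segment, and it treats every segment after the direct sound as reverberant.

```python
import math

# Band levels in dB copied from the slide; variable and function names are illustrative.
DIRECT_DB = -1.7
REFLECTED_DB = [-1.7, -8.5, -8.5, -8.5, -8.5, -10.2, -11.9, -13.6]

def db_to_energy(level_db):
    """Convert an energy level in dB to a linear energy value."""
    return 10.0 ** (level_db / 10.0)

def direct_to_reverb_db(direct_db, reflected_db):
    """Ratio of direct energy to the summed energy of all reflected segments, in dB."""
    direct = db_to_energy(direct_db)
    reflected = sum(db_to_energy(v) for v in reflected_db)
    return 10.0 * math.log10(direct / reflected)

if __name__ == "__main__":
    print(f"direct/reverb = {direct_to_reverb_db(DIRECT_DB, REFLECTED_DB):.1f} dB")
```

With these numbers the ratio comes out at about -3.3 dB, which is consistent with the -3dB in the slide title.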
Non-exponential decay direct/reverb = -3dB [Figure: two panels, 25ms averaging window and 100ms averaging window.] With this non-linear decay both 25ms sounds and 100ms sounds are perceived with high articulation. Longer notes and sounds also have high reverberance. Once again, it would be sonically more pleasant if the early reflections were reduced.
Examples See surround encoded DTS non-linear decay
Frequency Dependence We have so far been studying broadband reverberation. However, human perception is highly frequency dependent. As a consequence, our perceptions of intelligibility, articulation, loudness, and sonic distance are primarily influenced by frequencies above 700Hz. The perception of reverberance, warmth, and envelopment, by contrast, arises primarily from frequencies below 500Hz. It is possible to have both high clarity and high envelopment at the same time by carefully controlling the frequency dependence of the reflected energy.
The frequency transmission of the pinnae and middle ear From: B. C. J. Moore, B. R. Glasberg and T. Baer, A model for the prediction of thresholds, loudness and partial loudness, J. Audio Eng. Soc., vol. 45, pp (1997). The intensity of nerve firings is concentrated in the frequency range of human speech signals, about 700Hz to 4kHz. With a broad-band source, the ITD and IID at these frequencies will dominate the apparent direction.
Boston Symphony Hall, occupied, stage to front of balcony, 1000Hz
Boston Symphony Hall, occupied, stage to front of balcony, 250Hz
Adelaide - Festival Center Theater
Conclusions There is an ideal acoustic profile for opera performance. This profile may or may not be achievable through conventional acoustics. –Our goal is not ideal acoustics, it is ideal SOUND. When restricting the design to conventional acoustics, the optimal sound as determined by a rapid A/B test is less reverberant than most conductors think they want in the absence of an A/B test, at least above 700Hz. An optimal design will maximize the reverb radius above 700Hz, aiming for a strongly dual-slope decay as measured by the decay time to –6dB of a 50ms to 100ms sound. –This goal is best achieved by directing the direct sound (and first reflections) from the soloists into the audience. The optimal design will maximize the reverberation time and the reverberant level below 500Hz. Given the choice between high clarity and a compromise that reduces clarity somewhat in favor of more reverberance for the orchestra, CHOOSE CLARITY!