Presentation on theme: "Noise and Echo Control for Immersive Voice Communication in Spacesuits"— Presentation transcript:
1 Noise and Echo Control for Immersive Voice Communication in Spacesuits
Presented as a keynote speech at the International Workshop on Acoustic Echo and Noise Control (IWAENC) in Tel Aviv, Israel, on September 2, 2010
Yiteng (Arden) Huang, WeVoice, Inc., Bridgewater, New Jersey, USA
2 About the Project
- Financially sponsored by the NASA SBIR (Small Business Innovation Research) program
- Phase I feasibility research: Jan. 2008 – July 2008
- Phase II prototype development: Jan – Jan. 2011
- Other team members:
  - Jingdong Chen, WeVoice, Inc., Bridgewater, New Jersey, USA
  - Scott Sands, NASA Glenn Research Center (GRC), Cleveland, Ohio, USA
  - Jacob Benesty, University of Quebec, Montreal, Quebec, Canada
3 Outline
- Problem Identification and Research Motivation
- Problem Analysis and Technical Challenges
- Noise Control with Microphone Arrays
- Hardware Development
- Software Development
- A Portable, Real-Time Demonstration System
- Towards Immersive Voice Communication in Spacesuits
4 Section 1: Problem Identification and Research Motivation
5 Requirements of In-Suit Audio
- Speech Quality and Intelligibility: 90% word identification rate
- Hearing Protection: Limits total noise dose, hazard noise, and on-orbit continuous and impulse noise for waking and sleeping periods. Noise loads are very high during launch and orbital maneuvers.
- Audio Control and Interfaces: Provides manual silencing features and volume controls
- Operation at Non-Standard Barometric Pressure Levels (BPLs): Operates effectively between 30 kPa and 105 kPa
6 Current In-Suit Audio System
Current Solution: Communication Carrier Assembly (CCA) Audio System
Figure labels: Skullcap, Perspiration Absorption Area, Earpiece, Helmet, Helmet Ring, Chin Cup, Microphone Module, Microphone Boom
7 Extravehicular Mobility Unit (EMU) CCA
For shuttle and International Space Station (ISS) operations
Source: O. Sands, NASA GRC
Figure labels: Interconnect wiring, Nylon/spandex top, Teflon sidepiece and pocket, Electret Microphone, Interface cable and connector, Ear seal, Ear cup
A large gain is applied to the outbound speech to give sufficient sound volume at low static pressure levels (30 kPa). This gain leads to clipping and strong distortion during operations near sea-level BPL.
8 Advanced Crew Escape Suit (ACES) CCA
For shuttle launch and entry operations
Source: O. Sands, NASA GRC
Figure label: Dynamic Microphones
Hearing protection provided by the ACES CCA may not be sufficient.
9 Developmental CCA
Source: O. Sands, NASA GRC
Figure labels: Ear Cups, Noise Canceling Microphones, Active In-Canal Earpieces
- The active earpieces will be used in conjunction with the CCA ear cups during launch and other high noise events and can be removed for other suited operations.
- The active earpieces alone nearly provide the required level of hearing protection.
10 CCA Systems: Pros
- High outbound speech intelligibility and quality, SNR near optimum
- Use close-talking microphones
- A high degree of acoustic isolation between the in-suit noise and the suit subject’s vocalizations
- A high degree of acoustic isolation between the inbound and outbound signals
- The human body does NOT transmit vibration-borne noise
- Provide very good hearing protection
11 CCA Systems: Cons
- The microphones need to be close to the mouth of a suited subject.
- A number of recognized logistical issues and inconveniences:
  - The cap and the microphone booms cannot be adjusted during EVA operations, which can last from 4 to 8 hours.
  - The close-talking microphones interfere with the suited subject’s eating and drinking, and are susceptible to contamination.
  - The communication cap needs to fit well. Caps in a variety of different sizes need to be built and maintained, e.g., 5 sizes for EMU caps.
  - Wire fatigue for the microphone booms
- These problems cannot be resolved with incremental improvements to the basic design of the CCA systems.
12 Stakeholder Interviews
- The CCA ear cups produce pressure points that cause discomfort.
- Microphone arrays and helmet speakers are suggested as replacements.
- Suit subject comfort should be maximized as much as possible, given that other constraints can be met (relaxed and traded off):
  - Clear two-way voice communications
  - Hearing protection from the fan noise in the life support system ventilation loop
  - Properly containing and managing hair and sweat inside the helmet
  - Adequate SNR for the potential use of automatic speech recognition for the suit’s information system
13 Two Alternative Architectural Options for In-Suit Audio
- Integrated Audio (IA): Instead of being housed in a separate subassembly, both the microphones and the speakers are integrated into the suit/helmet.
- Hybrid Approach: Employs the inbound portion of a CCA system with the outbound portion of an IA system.
Figure label: Helmet Speaker
14 Section 2: Problem Analysis and Technical Challenges
15 Noise from Outside the Spacesuit
- During launch, entry descent, and landing: impulse noise < 140 dBSPL, hazard noise < 105 dBA
- On orbit: impulse noise < 140 dBSPL during waking hours and < 83 dBSPL during sleeping periods
- Limits on continuous on-orbit noise levels by frequency:
  Band Center Frequency (Hz): 63, 125, 250, 500, 1k, 2k, 4k, 8k, 16k
  Sound Pressure Level (dB): 72, 65, 60, 56, 53, 51, 50, 48
- Remark: During EVA operations, ambient noise is at most a minor problem.
SPL (dB) | Perception | Typical Environments
85 – 95 | Very High Noise: speech almost impossible to hear | Construction Site, Loud Machine Shop, Noisy Manufacturing
75 – 85 | High Noise: speech is difficult to hear | Assembly Line, Crowded Bus/Transit Waiting Area, Very Noisy Restaurant/Bar
65 – 75 | Medium Noise: must raise voice to be heard | Department Store, Band/Public Area, Supermarket
55 – 65 | Low Noise: speech is easy to hear | Doctor’s Office, Hospital, Hotel Lobby
16 Structure-Borne Noise Inside the Spacesuit
Four noise sources (Begault & Hieronymus 2007):
- Airflow and air inlet hissing noise, as well as fan/pump noise due to required air supply and circulation
- Arm, leg, and hip bearing noise
- Suit-impact noise, e.g., footfall
- Swishing-like noise due to air movement caused by walking (since the suits are closed pressure environments)
For CCA systems, since the suit subject’s body does not transmit bearing and impact noise, only airflow-related noise needs to be controlled.
For Integrated Audio (IA) systems, microphones are mounted directly on the suit structure and vibration noise is loud.
17 Acoustic Challenges
- Complicated noise field:
  - Temporal domain: has both stationary and non-stationary noise
  - Spectral domain: inherently wideband
  - Spatial domain: near field; possibly either directional or dispersive
- Highly reverberant enclosure:
  - The helmet is made of highly reflective materials.
  - Strong reverberation dramatically reduces the intelligibility of speech uttered by the suit subject and degrades the performance of an automatic speech recognizer.
  - Strong reverberation leads to a more dispersive noise field, which makes beamforming less effective.
18 Section 3: Noise Control with Microphone Arrays
19 Proposed Noise Control Scheme for IA/Hybrid Systems
Block diagram labels: Head Position Calibration; Head Motion Tracker; Mouth range and incident angle with respect to the microphone array; Acoustic Source Localization; Microphone Array; Beamforming; Multichannel Noise Reduction; Single Channel Noise Reduction; Adaptive Noise Cancellation; Noise Reference; Outbound Speech
20 Current Research Focus
Block diagram labels: Microphone Array; Beamforming; Multichannel Noise Reduction; Single Channel Noise Reduction; Outbound Speech
22 Fixed Beamformer vs. Adaptive Beamformer
Microphone array beamformers fall into two classes:
- Fixed beamformers: the noise field is stationary and known before the design (isotropic noise is generally assumed); reverberation is not a concern. Examples: Delay-and-Sum, Filter-and-Sum.
- Adaptive beamformers: the noise field is time varying and unknown; reverberation is significant. Examples: MVDR (Capon), LCMV (Frost)/GSC.
Properties of each beamformer:
- Delay-and-Sum: simple; non-uniform directional responses over a wide spectrum of frequencies
- Filter-and-Sum: complicated; uniform directional responses over a wide spectrum of frequencies, which is good for wideband signals like speech
- MVDR (Capon): only the TDOAs of the speech source of interest need to be known (simple requirements); reverberation causes the signal cancellation problem; time-domain or frequency-domain
- LCMV (Frost)/GSC: the impulse responses (IRs) from the source to the microphones have to be known or estimated; errors in the IRs lead to the signal cancellation problem
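The delay-and-sum entry above can be illustrated with a minimal sketch. It assumes integer sample delays that are already known (in practice they would come from TDOA estimation), and the function name `delay_and_sum` and the toy signals are hypothetical, not taken from the presentation.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average.

    channels: list of equal-length lists of samples, one per microphone.
    delays:   assumed per-microphone arrival delays in samples (>= 0),
              relative to the earliest microphone.
    """
    if len(channels) != len(delays):
        raise ValueError("need one delay per channel")
    n_mics = len(channels)
    max_delay = max(delays)
    out_len = len(channels[0]) - max_delay  # region where all channels overlap
    output = []
    for k in range(out_len):
        # Advance channel m by delays[m] so the source lines up across mics.
        acc = sum(channels[m][k + delays[m]] for m in range(n_mics))
        output.append(acc / n_mics)
    return output


# Toy example: the same pulse reaches mic 0 first, then mic 1, then mic 2.
pulse = [0.0, 1.0, 0.5, 0.0, 0.0, 0.0]
mics = [
    pulse,
    [0.0] + pulse[:-1],        # delayed by 1 sample
    [0.0, 0.0] + pulse[:-2],   # delayed by 2 samples
]
aligned = delay_and_sum(mics, [0, 1, 2])
```

After alignment the pulse adds coherently across the three channels, while noise arriving with other delays would not.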
23 Comments on Traditional Microphone Array Beamforming
- For incoherent noise sources, the gain in SNR is low if the number of microphones is small.
- For coherent noise sources whose directions are different from that of the speech source, a theoretically optimal gain in SNR can be high but is difficult to obtain due to a number of practical limitations:
  - Unavailability of precise a priori knowledge of the acoustic impulse responses from the speech sources to the microphones.
  - Inconsistent responses of the microphones across the array.
- For coherent noise sources that are in the same direction as the speech source, beamforming (as a spatial filter) is ineffective.
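To put a number on the first point, the sketch below uses the textbook idealization for delay-and-sum: the aligned speech adds in amplitude across N microphones while equal-power, mutually uncorrelated noise adds in power, so the SNR improves by a factor of N. This idealization and the function name `incoherent_snr_gain_db` are assumptions for illustration, not figures from the presentation.

```python
import math


def incoherent_snr_gain_db(num_mics):
    """SNR gain (dB) of delay-and-sum under equal-power, uncorrelated noise."""
    if num_mics < 1:
        raise ValueError("num_mics must be at least 1")
    # Signal power scales as N^2, noise power as N, so the gain is N.
    return 10.0 * math.log10(num_mics)


gains = {n: round(incoherent_snr_gain_db(n), 2) for n in (2, 4, 8)}
```

Each doubling of the microphone count buys only about 3 dB, which is why a small array gains little against incoherent noise.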
24 Multichannel Noise Reduction (MCNR)
A conceptual comparison of beamforming and MCNR. In both diagrams, the speech source of interest s(k) reaches microphones 1, 2, ..., N through the impulse responses g1, g2, ..., gN, with noise v(k), giving the microphone signals x1(k), x2(k), ..., xN(k).
Signal Model:
- Beamforming: spatial filtering. Dereverberation and denoising. Requires knowledge related to the source position or gn. Array setup: calibration is necessary, possibly time/effort consuming.
- MCNR: statistical filtering. Only denoising (output labeled x1,s(k)). Array setup: no need to strictly demand a specific array geometry/pattern.
25 Frequency-Domain MVDR Filter for MCNR
The problem formulation:
The MVDR filter:
A more practical implementation:
where
Similar to traditional single-channel noise reduction methods, the noise PSD matrix is estimated during silent periods and the signal PSD matrix is estimated during speech periods.
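The equations on this slide did not survive in the transcript. As a hedged illustration only, one standard frequency-domain formulation that matches the slide's description is sketched below. The notation is an assumption and may differ from the slide's: $\mathbf{y}$ is the vector of the $N$ noisy microphone spectra, $X_1$ is the speech component at microphone 1, $\mathbf{d}$ is the relative transfer function vector with first entry 1, $\boldsymbol{\Phi}_{y}$ and $\boldsymbol{\Phi}_{v}$ are the noisy-signal and noise PSD matrices, and $\mathbf{u} = [1, 0, \ldots, 0]^T$. The second form relies on the speech PSD matrix being rank one.

```latex
% Assumed model: y = d X_1 + v, so Phi_y = phi_{x_1} d d^H + Phi_v
\begin{align}
  \min_{\mathbf{h}} \; \mathbf{h}^{H} \boldsymbol{\Phi}_{v} \mathbf{h}
  \quad \text{subject to} \quad \mathbf{h}^{H} \mathbf{d} = 1, \\
  \mathbf{h}_{\mathrm{MVDR}}
  = \frac{\boldsymbol{\Phi}_{v}^{-1} \mathbf{d}}
         {\mathbf{d}^{H} \boldsymbol{\Phi}_{v}^{-1} \mathbf{d}}, \\
  \mathbf{h}_{\mathrm{MVDR}}
  = \frac{\left( \boldsymbol{\Phi}_{v}^{-1} \boldsymbol{\Phi}_{y} - \mathbf{I}_{N} \right) \mathbf{u}}
         {\operatorname{tr}\!\left[ \boldsymbol{\Phi}_{v}^{-1} \boldsymbol{\Phi}_{y} \right] - N}.
\end{align}
```

The last expression needs only the two PSD matrices, which is consistent with estimating the noise statistics during silent periods and the noisy-signal statistics during speech periods.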
26 Comparison of the MVDR Filters for Beamforming and MCNR
MVDR for Beamforming (BF):
MVDR for MCNR:
The acoustic impulse responses can at best be estimated up to a scale:
where
denotes the true response vector.
Leads to speech distortion.
Note: In the implementation of the MVDR-MCNR, the channel responses do not need to be known.
27 Distortionless Multichannel Wiener Filter for MCNR
Use what we called the spatial prediction:
Formulate the following optimization problem:
where
The distortionless multichannel Wiener (DW) filter for MCNR:
The optimal Wiener solution for the non-causal spatial prediction filters:
where
So,
It was found that
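The slide's equations are missing from the transcript. The sketch below shows one way the spatial-prediction idea can be written down. The symbols are assumptions rather than the slide's notation: $X_n$ is the speech component at microphone $n$, $W_n$ is a non-causal spatial prediction filter with $W_1 = 1$, $\mathbf{w} = [1, W_2, \ldots, W_N]^T$, and $\boldsymbol{\Phi}_{v}$ is the noise PSD matrix.

```latex
\begin{align}
  X_n &= W_n X_1, \qquad n = 1, \ldots, N
      && \text{(spatial prediction)} \\
  \min_{\mathbf{h}} \; & \mathbf{h}^{H} \boldsymbol{\Phi}_{v} \mathbf{h}
      \quad \text{subject to} \quad \mathbf{h}^{H} \mathbf{w} = 1
      && \text{(no distortion of } X_1\text{)} \\
  \mathbf{h}_{\mathrm{DW}} &=
      \frac{\boldsymbol{\Phi}_{v}^{-1} \mathbf{w}}
           {\mathbf{w}^{H} \boldsymbol{\Phi}_{v}^{-1} \mathbf{w}} \\
  W_n &= \frac{E\left[ X_n X_1^{*} \right]}{E\left[ |X_1|^{2} \right]}
      && \text{(Wiener solution for the predictor)}
\end{align}
```

Under these assumptions, when the speech components are exactly related by $X_n = W_n X_1$, the vector $\mathbf{w}$ plays the role of a relative transfer function vector and $\mathbf{h}_{\mathrm{DW}}$ takes the same form as an MVDR filter.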
28 Single-Channel Noise Reduction (SCNR) for Post-Filtering
Beamforming: The Wiener filter (the optimal solution in the MMSE sense) can be factorized as
[MVDR Beamformer] × [Wiener Filter for SCNR]
Note: For a complete and detailed development of this factorization, please refer to Eq. (3.19) of the following book.
M. Brandstein and D. Ward, eds., Microphone Arrays: Signal Processing Techniques and Applications, Berlin, Germany: Springer, 2001.
MCNR: Again, the Wiener filter can be factorized as
[MVDR for MCNR] × [Wiener Filter for SCNR]
Note: For a complete and detailed development of this factorization, please refer to Eq. (6.117) of the following book.
J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Berlin, Germany: Springer, 2008.
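The factorization itself is not preserved in the transcript. For the beamforming case, a sketch of the well-known result follows. The notation is assumed, not copied from the cited books: $\mathbf{y} = \mathbf{g} S + \mathbf{v}$, $\phi_s$ is the source PSD, $\boldsymbol{\Phi}_{v}$ is the noise PSD matrix, and the filter output is $\mathbf{h}^{H} \mathbf{y}$.

```latex
\begin{align}
  \mathbf{h}_{\mathrm{W}}
    &= \boldsymbol{\Phi}_{y}^{-1} \mathbf{g}\, \phi_s
     = \frac{\phi_s\, \boldsymbol{\Phi}_{v}^{-1} \mathbf{g}}
            {1 + \phi_s\, \mathbf{g}^{H} \boldsymbol{\Phi}_{v}^{-1} \mathbf{g}} \\
    &= \underbrace{\frac{\boldsymbol{\Phi}_{v}^{-1} \mathbf{g}}
                        {\mathbf{g}^{H} \boldsymbol{\Phi}_{v}^{-1} \mathbf{g}}}_{\text{MVDR beamformer}}
       \cdot
       \underbrace{\frac{\phi_s}{\phi_s + \phi_{v,\mathrm{out}}}}_{\text{single-channel Wiener gain}},
    \qquad
    \phi_{v,\mathrm{out}} = \frac{1}{\mathbf{g}^{H} \boldsymbol{\Phi}_{v}^{-1} \mathbf{g}}.
\end{align}
```

Here $\phi_{v,\mathrm{out}}$ is the residual noise PSD at the MVDR output, so the scalar factor is a single-channel Wiener filter applied to the beamformer output.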
29 Single-Channel Noise Reduction (SCNR)
The signal model:
SCNR filter:
Error signal:
MSE cost function:
The Wiener filter:
where
and
Other SCNR methods: Parametric Wiener filter, Tradeoff filter.
A well-known feature: Noise reduction is achieved at the cost of adding speech distortion.
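As a rough illustration of the Wiener filter named above, the sketch below computes a per-frequency-bin gain from PSD estimates, assuming speech and noise are uncorrelated so that the noisy PSD is the sum of the speech and noise PSDs. The function name `wiener_gain`, the clamping to the range from `floor` to 1, and the sample values are illustrative assumptions, not details from the presentation.

```python
def wiener_gain(noisy_psd, noise_psd, floor=0.0):
    """Per-bin Wiener gain H = 1 - phi_v / phi_y, clamped to [floor, 1]."""
    if noisy_psd <= 0.0:
        return floor
    gain = 1.0 - noise_psd / noisy_psd
    # Clamp because PSD estimates can make the raw gain fall outside [0, 1].
    return min(1.0, max(floor, gain))


# Three hypothetical frequency bins with the same noise level.
noisy = [4.0, 2.0, 1.0]
noise = [1.0, 1.0, 1.0]
gains = [wiener_gain(y, v) for y, v in zip(noisy, noise)]
```

Bins with a low SNR are attenuated heavily, which removes noise but also removes the speech energy in those bins. That is the distortion trade-off the slide refers to.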
30 New Idea for SCNR
A second-order complex circular random variable (CCRV) has:
which implies that the variable and its conjugate are uncorrelated.
In general, speech is not a second-order CCRV:
But noise is a second-order CCRV if stationary, and not otherwise.
Examine
This is similar to the signal model of a two-element microphone array. So there is a chance to reduce noise without adding any speech distortion.
Equation annotations: "Correlated but not completely coherent"; "Uncorrelated or coherent"
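Second-order circularity can be checked numerically. The sketch below estimates the ratio |E[A^2]| / E[|A|^2] from zero-mean complex samples: a value near 0 means the variable and its conjugate are uncorrelated (circular), and a value near 1 means they are strongly correlated. The function name `circularity_quotient` and the sample sets are hypothetical.

```python
def circularity_quotient(samples):
    """Estimate |E[A^2]| / E[|A|^2] from zero-mean complex samples."""
    n = len(samples)
    pseudo_variance = sum(a * a for a in samples) / n   # E[A^2]
    variance = sum(abs(a) ** 2 for a in samples) / n    # E[|A|^2]
    if variance == 0.0:
        raise ValueError("samples have zero power")
    return abs(pseudo_variance) / variance


# Equal power on the real and imaginary axes: E[A^2] cancels to zero.
circular_like = [1 + 0j, 1j, -1 + 0j, -1j]
# Purely real samples: A equals its conjugate, so the quotient is 1.
real_valued = [1 + 0j, -1 + 0j, 2 + 0j, -2 + 0j]

q_circ = circularity_quotient(circular_like)
q_real = circularity_quotient(real_valued)
```

A nonzero quotient for the speech spectrum is what lets the conjugate of the observation act like a second, partially correlated channel.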
31 Widely Linear Wiener Filter
New filter for SCNR:
Error signal:
Widely linear MSE:
Then the widely linear Wiener filter or MVDR type of filters can be developed.
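The slide's equations are not preserved. A common way to write a widely linear estimator is sketched below. The notation is an assumption: $Y$ is the noisy spectral coefficient, $X$ is the clean speech coefficient, and $\tilde{\mathbf{y}} = [\,Y,\; Y^{*}\,]^{T}$ stacks the observation with its conjugate.

```latex
\begin{align}
  \hat{X} &= H^{*} Y + H'^{*} Y^{*} = \tilde{\mathbf{h}}^{H} \tilde{\mathbf{y}},
  \qquad \tilde{\mathbf{h}} = [\,H,\; H'\,]^{T} \\
  \mathcal{E} &= X - \tilde{\mathbf{h}}^{H} \tilde{\mathbf{y}} \\
  J(\tilde{\mathbf{h}}) &= E\left[ |\mathcal{E}|^{2} \right] \\
  \tilde{\mathbf{h}}_{\mathrm{W}} &= \boldsymbol{\Phi}_{\tilde{y}}^{-1} \boldsymbol{\phi}_{\tilde{y} x},
  \qquad
  \boldsymbol{\Phi}_{\tilde{y}} = E\left[ \tilde{\mathbf{y}} \tilde{\mathbf{y}}^{H} \right],
  \quad
  \boldsymbol{\phi}_{\tilde{y} x} = E\left[ \tilde{\mathbf{y}} X^{*} \right]
\end{align}
```

Because $\tilde{\mathbf{y}}$ has two entries, the problem has the same structure as a two-microphone array, so a distortionless (MVDR-type) constraint can be imposed in the same way as in the multichannel case.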
32 Section 4: Hardware Development
33 Computational Platform/Technology Selection
- Three platforms under consideration: ASIC, DSP, FPGA
- Trade-off among performance, power consumption, size, and costs
- Four competing factors:
  - The count of transistors employed
  - The number of clock cycles required
  - The time taken to develop an application
  - Nonrecurring engineering (NRE) costs
- ASIC:
  - Low numbers of transistors and clock cycles
  - Long development time and high NRE costs
  - Effective in performance, power, and size, but not in cost
- DSP:
  - Low development and NRE costs
  - Low power consumption
  - More effort to convert the design to ASICs
- FPGA:
  - Not suited to processing sequential conditional data flow, but efficient in concurrent applications
  - Supports faster I/O than DSPs
  - One step closer to ASIC than DSP
  - High development cost due to performance optimization
36 Prototype FPGA Board: the Top View
Board labels: Phantom Power Feeding, User LEDs, User I/Os, FT2232H, USB 2.0 Jack, GND, OPA1632, REF1004, ADS1278, EPCS16S, JTAG, 12 MHz Crystal, TPS65053, Flash, DB25, DC Power Jack, Power LED, Mic. Preamp Gain Jumpers, Analog Power DC 9V, Analog Power DC 5V, Cyclone III FPGA, SDRAMs
Board size: 174.8 mm × 101 mm
38 FPGA System Development Flow Adopted in the Project
System on Programmable Chip (SoPC) + C/C++ Programming:
- Use SoPC Builder to construct a soft-core NIOS II processor embedded on the Altera FPGA
- Develop software/DSP systems in C/C++ on the NIOS II processor
Diagram labels: CPU (NIOS II), ROM, RAM, I/O, UART, DSP
Advantages:
- Short development cycle/time
- Low cost
- High reliability
- Reusability of intellectual property
Drawbacks:
- Poor efficiency and low performance. Efficiency can be improved by identifying the time-consuming functions (e.g., FFT and IFFT) and accelerating them with the C2H (C-to-Hardware) tool.
41 Section 5: Software Development
42 FPGA Program Flowchart
Pipeline stages, repeated for each time frame:
- FPGA: data in and preprocessing (from ADC), overlap-add, USB transmission (to USB)
- Nios II Soft Core: MCNR + SCNR
- FFT/IFFT Processor: 4-channel FFT, 1-channel IFFT
Time axis (ms): t, t+4, t+8
Processing delay < 8 ms
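The overlap-add step in the flowchart can be sketched as follows. The example assumes a periodic Hann analysis window with 50% overlap and omits the FFT, filtering, and IFFT stages so that exact reconstruction can be checked. The frame length of 16 samples, the window choice, and the function name `overlap_add_passthrough` are illustrative assumptions and are not the parameters of the actual system.

```python
import math


def overlap_add_passthrough(signal, frame_len):
    """Window frames at 50% overlap and overlap-add them back together."""
    if frame_len % 2:
        raise ValueError("frame_len must be even for 50% overlap")
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by frame_len / 2 sum to 1.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_len)
              for n in range(frame_len)]
    output = [0.0] * len(signal)
    start = 0
    while start + frame_len <= len(signal):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # A real system would apply FFT, noise reduction, and IFFT here.
        for n in range(frame_len):
            output[start + n] += frame[n]
        start += hop
    return output


signal = [math.sin(0.3 * k) for k in range(64)]
rebuilt = overlap_add_passthrough(signal, frame_len=16)
# Samples covered by two overlapping frames are reconstructed exactly.
interior_error = max(abs(rebuilt[k] - signal[k]) for k in range(8, 56))
```

With a hop of half a frame, each output sample is complete once the following frame has been processed, which is one reason frame length directly bounds the processing delay.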
43 IA System Windows Host Software
- Programmed with Microsoft Visual C++
- DirectSound is used to play back audio (speech).
Figure: Splash window of the program
47 Section 6: A Portable, Real-Time Demonstration System
48 The Portable, Real-Time Demo System
Figure labels: FPGA Board, Power Supply (Linear DC 12-20V/1A), Suited Subject, DB25 Connectors, PC, USB 2.0 Cable, MEMS Microphone Array, Audio Cable
49 Section 7: Towards Immersive Voice Communication in Spacesuits
50 What Is Immersive Communication and Why Do We Want It?
- Telecommunication helps people collaborate and share information by cutting across the following 3 separations/constraints:
  - Long distance
  - Real time
  - Physical boundaries
- Modern telecommunication technologies have so far succeeded in transcending the first two constraints, i.e., the long-distance and real-time constraints.
- Immersive communication offers a feeling of being together and sharing a common environment during collaboration.
- Immersive communication aims to break the physical boundaries, which is the “last mile” problem in communication.
51 What needs to be solved for immersive communication systems? Single-Channel Acoustic Echo Cancellation
52 What needs to be solved for immersive communication systems? Multichannel Acoustic Echo Cancellation
53 What needs to be solved for immersive communication systems? Synthesized Stereo Audio Mixing System
54 What needs to be solved for immersive communication systems? Beamforming; Blind Source Separation
55 What needs to be solved for immersive communication systems? Acoustic Source Localization and Tracking
56 What needs to be solved for immersive communication systems? Stereophony System for Spatial Sound Reproduction
57 What needs to be solved for immersive communication systems? Wave Field Synthesis
58 Why Immersive Voice Communication in Spacesuits?
- Immersive voice communication exploits human binaural hearing.
- Provides enhanced situational awareness for a suited crewmember:
  - Can improve the productivity of collaboration among the crewmembers
  - Can produce potential safety benefits
- Crew comfort can be optimized.
59 What Problems Need to be Solved?
- Stereo/Multichannel Acoustic Echo Cancellation (MCAEC)
- Integration of MCAEC and MCNR
- Three Dimensional (3D) Audio
60 Conclusions
- While it has been more than 40 years since Neil Armstrong landed on the Moon, astronauts are still using the communication carrier assembly (CCA) based audio system for voice communication in spacesuits.
- The new spacesuit design is going to take advantage of the most recent advances in multichannel acoustic and speech signal processing for echo and noise control, while significantly improving crew comfort and ease of use:
  - Noise reduction with microphone arrays
  - Multichannel echo cancellation
  - Integrated echo and noise control
  - 3D audio
- We explained the difference between the traditional beamforming method and what we called the multichannel noise reduction approach.
- We presented an intuitive interpretation of the widely linear Wiener filter for single-channel noise reduction.
- We described a new application of immersive communication in space exploration, ancillary to its mainstream use in commercial telecommunication systems.