Noise and Echo Control for Immersive Voice Communication in Spacesuits

Name: Noise and Echo Control for Immersive Voice Communication in Spacesuits
Uploaded: 2017-07-11T01:01:52+00:00
Duration: PTM34S35
Channel: Jared Pottle
Description: Noise and Echo Control for Immersive Voice Communication in Spacesuits

Presented as a keynote speech at the International Workshop on Acoustic Echo and Noise Control (IWAENC) in Tel Aviv, Israel, on September 2, 2010.
Yiteng (Arden) Huang, WeVoice, Inc., Bridgewater, New Jersey, USA

About the Project
Financially sponsored by the NASA SBIR (Small Business Innovation Research) program
Phase I feasibility research: Jan. 2008 – July 2008
Phase II prototype development: Jan – Jan. 2011
Other team members:
  Jingdong Chen, WeVoice, Inc., Bridgewater, New Jersey, USA
  Scott Sands, NASA Glenn Research Center (GRC), Cleveland, Ohio, USA
  Jacob Benesty, University of Quebec, Montreal, Quebec, Canada

Outline
Problem Identification and Research Motivation
Problem Analysis and Technical Challenges
Noise Control with Microphone Arrays
Hardware Development
Software Development
A Portable, Real-Time Demonstration System
Towards Immersive Voice Communication in Spacesuits

Section 1: Problem Identification and Research Motivation

Requirements of In-Suit Audio
Speech quality and intelligibility: 90% word identification rate.
Hearing protection: limits total noise dose, hazard noise, and on-orbit continuous and impulse noise for waking and sleeping periods. Noise loads are very high during launch and orbital maneuvers.
Audio control and interfaces: provides manual silencing features and volume controls.
Operation at non-standard barometric pressure levels (BPLs): operates effectively between 30 kPa and 105 kPa.

Current In-Suit Audio System
Current solution: Communication Carrier Assembly (CCA) audio system.
[Figure: CCA audio system with labeled parts: skullcap, perspiration absorption area, earpiece, helmet, helmet ring, chin cup, microphone module, microphone boom.]

Extravehicular Mobility Unit (EMU) CCA
For shuttle and International Space Station (ISS) operations.
[Figure (source: O. Sands, NASA GRC) with labeled parts: interconnect wiring, nylon/spandex top, Teflon sidepiece and pocket, electret microphone, interface cable and connector, ear seal, ear cup.]
A large gain is applied to the outbound speech to give sufficient sound volume at low static pressure levels (30 kPa). The same gain leads to clipping and strong distortion during operations near sea-level BPL.

Advanced Crew Escape Suit (ACES) CCA
For shuttle launch and entry operations.
[Figure (source: O. Sands, NASA GRC): dynamic microphones.]
Hearing protection provided by the ACES CCA may not be sufficient.

Developmental CCA
[Figures (source: O. Sands, NASA GRC): ear cups, noise-canceling microphones, active in-canal earpieces.]
The active earpieces will be used in conjunction with the CCA ear cups during launch and other high-noise events and can be removed for other suited operations. The active earpieces alone nearly provide the required level of hearing protection.

CCA Systems: Pros
High outbound speech intelligibility and quality, with SNR near optimum:
  Use close-talking microphones.
  A high degree of acoustic isolation between the in-suit noise and the suit subject’s vocalizations.
  A high degree of acoustic isolation between the inbound and outbound signals.
  The human body does NOT transmit vibration-borne noise.
Provide very good hearing protection.

CCA Systems: Cons
The microphones need to be close to the mouth of a suited subject.
A number of recognized logistical issues and inconveniences:
  The cap and the microphone booms cannot be adjusted during EVA operations, which can last from 4 to 8 hours.
  The close-talking microphones interfere with the suited subject’s eating and drinking, and are susceptible to contamination.
  The communication cap needs to fit well. Caps in a variety of different sizes need to be built and maintained, e.g., 5 sizes for EMU caps.
  Wire fatigue for the microphone booms.
These problems cannot be resolved with incremental improvements to the basic design of the CCA systems.

Stakeholder Interviews
The CCA ear cups produce pressure points that cause discomfort.
Stakeholders suggested using microphone arrays and helmet speakers.
Suit subject comfort should be maximized as much as possible, given that other constraints can be met (relaxed and traded off):
  Clear two-way voice communications
  Hearing protection from the fan noise in the life support system ventilation loop
  Properly containing and managing hair and sweat inside the helmet
  Adequate SNR for the potential use of automatic speech recognition for the suit’s information system

Two Alternative Architectural Options for In-Suit Audio
Integrated Audio (IA): Instead of being housed in a separate subassembly, both the microphones and the speakers are integrated into the suit/helmet.
Hybrid Approach: Employs the inbound portion of a CCA system with the outbound portion of an IA system.
[Figure: helmet speaker.]

Noise from Outside the Spacesuit
During launch, entry, descent, and landing: impulse noise < 140 dBSPL, hazard noise < 105 dBA.
On orbit: impulse noise < 140 dBSPL during waking hours and < 83 dBSPL during sleeping periods.
Limits on continuous on-orbit noise levels by frequency:
  Band center frequency (Hz): 63, 125, 250, 500, 1k, 2k, 4k, 8k, 16k
  Sound pressure level (dB): 72, 65, 60, 56, 53, 51, 50, 48
Remark: During EVA operations, ambient noise is at most a minor problem.

SPL (dB) | Perception | Typical environments
85 – 95 | Very high noise: speech almost impossible to hear | Construction site, loud machine shop, noisy manufacturing
75 – 85 | High noise: speech is difficult to hear | Assembly line, crowded bus/transit waiting area, very noisy restaurant/bar
65 – 75 | Medium noise: must raise voice to be heard | Department store, band/public area, supermarket
55 – 65 | Low noise: speech is easy to hear | Doctor’s office, hospital, hotel lobby
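Per-band limits like those listed for continuous on-orbit noise are often summarized as a single overall level by summing the band powers. The sketch below is only an illustration of that arithmetic, assuming an unweighted (no A-weighting) energetic sum over the eight listed values; the function name is hypothetical and the slide itself does not state an overall figure.

```python
import math

def combine_levels_db(levels_db):
    """Energetic (power) sum of sound pressure levels given in dB."""
    total_power = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_power)

# The eight per-band limits listed on the slide, in dB.
band_limits_db = [72, 65, 60, 56, 53, 51, 50, 48]
overall_db = combine_levels_db(band_limits_db)
print(f"Overall unweighted level: {overall_db:.1f} dB")
```

The result is dominated by the lowest bands, which is typical when limits fall with frequency.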

Structure-Borne Noise Inside the Spacesuit
Four noise sources (Begault & Hieronymus 2007):
  Airflow and air inlet hissing noise, as well as fan/pump noise due to required air supply and circulation
  Arm, leg, and hip bearing noise
  Suit-impact noise, e.g., footfall
  Swishing-like noise due to air movement caused by walking (since the suits are closed pressure environments)
For CCA systems, since the suit subject’s body does not transmit bearing and impact noise, only airflow-related noise needs to be controlled.
For Integrated Audio (IA) systems, microphones are mounted directly on the suit structure and vibration noise is loud.

Acoustic Challenges
Complicated noise field:
  Temporal domain: both stationary and non-stationary noise
  Spectral domain: inherently wideband
  Spatial domain: near field; possibly either directional or dispersive
Highly reverberant enclosure:
  The helmet is made of highly reflective materials.
  Strong reverberation dramatically reduces the intelligibility of speech uttered by the suit subject and degrades the performance of an automatic speech recognizer.
  Strong reverberation leads to a more dispersive noise field, which makes beamforming less effective.

Proposed Noise Control Scheme for IA/Hybrid Systems
[Block diagram, stages numbered 1 to 5. Blocks: head position calibration and head motion tracker, acoustic source localization (mouth range and incident angle with respect to the microphone array), microphone array, beamforming, multichannel noise reduction, single-channel noise reduction, and adaptive noise cancellation with a noise reference. Output: outbound speech.]

Current Research Focus
[Block diagram, stages numbered 1 to 4: microphone array, beamforming, multichannel noise reduction, and single-channel noise reduction, producing the outbound speech.]

Beamforming: Far-Field vs. Near-Field
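[Figure: two N-element linear arrays with element spacing d. In each, the microphone signals X1(f), X2(f), ..., XN(f) pass through filters h1, h2, ..., hN and are summed. Far-field case: the source of interest S(f, θ) and the noise V(f, ψ) both arrive as plane waves, with a path difference of (N-1)·d·cos(ψ) across the array; the output is Y(f, ψ, θ). Near-field case: the source S(f, rs) is at position rs while the noise V(f, ψ) is still a far-field plane wave; the output is Y(f, ψ, rs).]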

Fixed Beamformer vs. Adaptive Beamformer
Microphone array beamformers are chosen according to the noise field and the reverberation.
Noise field stationary and known before the design (isotropic noise generally assumed): fixed beamformers
  Delay-and-Sum: simple; non-uniform directional responses over a wide spectrum of frequencies.
  Filter-and-Sum: complicated; uniform directional responses over a wide spectrum of frequencies, which is good for wideband signals like speech.
Noise field time varying and unknown: adaptive beamformers
  Reverberation not a concern: MVDR (Capon). Only the TDOAs of the speech source of interest need to be known (simple requirements). Reverberation causes the signal cancellation problem. Time-domain or frequency-domain.
  Reverberation significant: LCMV (Frost)/GSC. The impulse responses (IRs) from the source to the microphones have to be known or estimated. Errors in the IRs lead to the signal cancellation problem.
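The remark that delay-and-sum has a non-uniform directional response across frequency can be checked numerically. The following is a minimal sketch under stated assumptions: a uniform linear array in the far field, 4 microphones at 20 mm spacing, and a speed of sound of 343 m/s. The function name and all parameter values are illustrative and are not taken from the slides.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (room-temperature air)

def delay_and_sum_response(freq_hz, psi_deg, theta_deg, n_mics, spacing_m):
    """Magnitude response of a far-field delay-and-sum beamformer.

    The uniform linear array is steered to theta_deg; the plane wave
    arrives from psi_deg. Angles are measured from the array axis.
    """
    psi = math.radians(psi_deg)
    theta = math.radians(theta_deg)
    # Residual delay per element after steering, in seconds.
    tau = spacing_m * (math.cos(psi) - math.cos(theta)) / SPEED_OF_SOUND
    total = sum(cmath.exp(-2j * math.pi * freq_hz * n * tau)
                for n in range(n_mics))
    return abs(total) / n_mics

# Array steered to broadside (90 degrees); interference from endfire (0 degrees).
for f in (500, 1000, 2000, 4000):
    r = delay_and_sum_response(f, psi_deg=0, theta_deg=90, n_mics=4, spacing_m=0.02)
    print(f"{f:5d} Hz: off-axis response {r:.3f}")
```

With these assumed values the off-axis wave is barely attenuated at low frequencies and strongly attenuated at high frequencies, which is the frequency dependence that filter-and-sum designs try to even out.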

Comments on Traditional Microphone Array Beamforming
For incoherent noise sources, the gain in SNR is low if the number of microphones is small.
For coherent noise sources whose directions are different from that of the speech source, the theoretically optimal gain in SNR can be high but is difficult to obtain due to a number of practical limitations:
  Unavailability of precise a priori knowledge of the acoustic impulse responses from the speech sources to the microphones.
  Inconsistent responses of the microphones across the array.
For coherent noise sources that are in the same direction as the speech source, beamforming (as a spatial filter) is ineffective.
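The first point can be made concrete: for spatially incoherent noise of equal power at every microphone, averaging N time-aligned channels improves the SNR by about 10·log10(N) dB, so only about 6 dB for four microphones. The simulation below is a rough sketch assuming independent white Gaussian noise per channel and a speech component that is identical across channels after alignment; the function name and parameters are hypothetical.

```python
import math
import random

def snr_gain_db(n_mics, n_samples=20000, seed=0):
    """Simulated SNR gain from averaging n_mics time-aligned channels.

    Each channel is assumed to carry the same speech component plus
    independent unit-variance white Gaussian noise (incoherent noise).
    """
    rng = random.Random(seed)
    noise_power_in = 0.0
    noise_power_out = 0.0
    for _ in range(n_samples):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_mics)]
        noise_power_in += noise[0] ** 2
        noise_power_out += (sum(noise) / n_mics) ** 2
    # The aligned speech component passes the average unchanged, so the
    # SNR gain equals the noise power reduction.
    return 10.0 * math.log10(noise_power_in / noise_power_out)

for n in (2, 4, 8):
    print(n, round(snr_gain_db(n), 2), round(10 * math.log10(n), 2))
```

Each printed row shows the simulated gain next to the theoretical 10·log10(N) value.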

Multichannel Noise Reduction (MCNR)
A conceptual comparison of beamforming and MCNR:
[Figure: in both cases the speech source of interest s(k) reaches microphones 1, 2, ..., N through impulse responses g1, g2, ..., gN, and with additive noise v(k) gives x1(k), x2(k), ..., xN(k). Beamforming performs dereverberation and denoising and needs knowledge related to the source position or gn. MCNR performs only denoising, with x1,s(k) in its signal model.]
Beamformer: spatial filtering. Array setup: calibration is necessary, possibly time/effort consuming.
MCNR: statistical filtering. Array setup: no need to strictly demand a specific array geometry/pattern.

Frequency-Domain MVDR Filter for MCNR
The problem formulation: [equation]
The MVDR filter: [equation]
A more practical implementation: [equation], where [equation]
Similar to traditional single-channel noise reduction methods, the noise PSD matrix is estimated during silent periods and the signal PSD matrix is estimated during speech periods.
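To illustrate how an MVDR filter for MCNR can be built from PSD matrices alone, here is a minimal single-bin, two-microphone sketch. It assumes one form that appears in the multichannel noise reduction literature, h = (Φvv⁻¹Φyy − I)u / (tr(Φvv⁻¹Φyy) − N) with u = [1, 0]ᵀ; this is an assumption and is not confirmed to be the expression on the slide. All numeric values and helper names are hypothetical.

```python
import cmath

def mat_mul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(m):
    """Inverse of a 2x2 complex matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def mvdr_mcnr(phi_yy, phi_vv):
    """Assumed form: h = (Phi_vv^-1 Phi_yy - I) u / (tr(Phi_vv^-1 Phi_yy) - N),
    with N = 2 microphones and u = [1, 0]^T selecting microphone 1."""
    a = mat_mul(mat_inv(phi_vv), phi_yy)
    denom = a[0][0] + a[1][1] - 2
    return [(a[0][0] - 1) / denom, a[1][0] / denom]

# Toy single-bin scene (all values hypothetical).
g = [1.0 + 0j, 0.8 * cmath.exp(-0.6j)]   # channel responses relative to mic 1
phi_s = 2.0                               # speech PSD at mic 1
phi_xx = [[phi_s * g[i] * g[j].conjugate() for j in range(2)] for i in range(2)]
phi_vv = [[1.0 + 0j, 0.3 + 0.1j], [0.3 - 0.1j, 1.0 + 0j]]
phi_yy = [[phi_xx[i][j] + phi_vv[i][j] for j in range(2)] for i in range(2)]

h = mvdr_mcnr(phi_yy, phi_vv)   # g is never passed to the filter design
speech_gain = sum(h[i].conjugate() * g[i] for i in range(2))
noise_out = sum(h[i].conjugate() * phi_vv[i][j] * h[j]
                for i in range(2) for j in range(2)).real
print(f"speech gain: {abs(speech_gain):.6f}, noise out: {noise_out:.3f} (in: 1.000)")
```

In this toy scene the speech component at microphone 1 passes with unit gain while the output noise power drops below its value at microphone 1, and the filter is computed without access to the channel responses.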

Comparison of the MVDR Filters for Beamforming and MCNR
MVDR for beamforming (BF): [equation]
MVDR for MCNR: [equation]
The acoustic impulse responses can at best be estimated up to a scale: [equation], where [symbol] denotes the true response vector. This leads to speech distortion.
Note: In the implementation of the MVDR-MCNR, the channel responses do not need to be known.

Distortionless Multichannel Wiener Filter for MCNR
Use what we call spatial prediction: [equation]
Formulate the following optimization problem: [equation], where [equation]
The distortionless multichannel Wiener (DW) filter for MCNR: [equation]
The optimal Wiener solution for the non-causal spatial prediction filters: [equation], where [equation]
So, [equation]
It was found that [equation]

Single-Channel Noise Reduction (SCNR) for Post-Filtering
Beamforming: The Wiener filter (the optimal solution in the MMSE sense) can be factorized into an MVDR beamformer and a Wiener filter for SCNR.
Note: For a complete and detailed development of this factorization, please refer to Eq. (3.19) of the following book: M. Brandstein and D. Ward, eds., Microphone Arrays: Signal Processing Techniques and Applications, Berlin, Germany: Springer, 2001.
MCNR: Again, the Wiener filter can be factorized into the MVDR filter for MCNR and a Wiener filter for SCNR.
Note: For a complete and detailed development of this factorization, please refer to Eq. (6.117) of the following book: J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Berlin, Germany: Springer, 2008.

Single-Channel Noise Reduction (SCNR)
The signal model: [equation]
SCNR filter: [equation]
Error signal: [equation]
MSE cost function: [equation]
The Wiener filter: [equation], where [equation] and [equation]
Other SCNR methods: parametric Wiener filter, tradeoff filter.
A well-known feature: noise reduction is achieved at the cost of adding speech distortion.
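The tradeoff in the last point can be seen in the textbook per-bin Wiener gain, H = φxx / φyy = 1 − φvv / φyy, which scales the noisy coefficient and therefore also scales the speech in it. The sketch below assumes that standard form and unit noise power; the function name, the optional floor parameter, and the test SNRs are illustrative only.

```python
def wiener_gain(phi_yy, phi_vv, floor=0.0):
    """Per-bin Wiener gain H = phi_xx / phi_yy, with phi_xx = phi_yy - phi_vv.

    phi_yy: PSD of the noisy signal; phi_vv: PSD of the noise.
    The result is clamped at `floor` in case the estimate goes negative.
    """
    if phi_yy <= 0.0:
        return floor
    return max(1.0 - phi_vv / phi_yy, floor)

for snr_db in (-10, 0, 10, 20):
    snr = 10 ** (snr_db / 10)
    gain = wiener_gain(phi_yy=snr + 1.0, phi_vv=1.0)  # noise PSD fixed at 1
    print(f"input SNR {snr_db:4d} dB -> gain {gain:.3f}")
```

At low input SNR the gain is far below 1, so the noise is suppressed strongly but the speech in that bin is attenuated by the same factor.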

New Idea for SCNR
A second-order complex circular random variable (CCRV) has: [equation], which implies that [symbol] and its conjugate are uncorrelated.
In general, speech is not a second-order CCRV: [equation]
But noise is a second-order CCRV if stationary, and not otherwise.
Examine [equation]. This is similar to the signal model of a two-element microphone array, so there is a chance to reduce noise without adding any speech distortion.
[Figure annotations: "Correlated but not completely coherent"; "Uncorrelated or coherent".]
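Second-order circularity can be tested numerically through the ratio |E[X²]| / E[|X|²], which is 0 when X and its conjugate are uncorrelated and reaches 1 when they are fully coherent. The sketch below is a toy illustration under assumptions not taken from the slides: circular noise is modeled as a complex Gaussian with independent, equal-variance real and imaginary parts, and the non-circular example is a real-valued amplitude on a fixed phase rather than real speech data.

```python
import cmath
import random

def circularity_quotient(samples):
    """|E[X^2]| / E[|X|^2]: near 0 for a second-order circular variable."""
    pseudo_variance = sum(x * x for x in samples) / len(samples)
    variance = sum(abs(x) ** 2 for x in samples) / len(samples)
    return abs(pseudo_variance) / variance

rng = random.Random(1)
# Circular model: independent real and imaginary parts of equal variance.
noise = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]
# Non-circular toy signal: random real amplitude on a fixed phase.
fixed_phase = [rng.gauss(0, 1) * cmath.exp(0.7j) for _ in range(20000)]

q_noise = circularity_quotient(noise)
q_fixed = circularity_quotient(fixed_phase)
print(f"circular noise: {q_noise:.3f}, fixed-phase signal: {q_fixed:.3f}")
```

A variable whose quotient is clearly above 0 carries usable information in its conjugate, which is what a widely linear filter exploits.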

Widely Linear Wiener Filter
New filter for SCNR: [equation]
Error signal: [equation]
Widely linear MSE: [equation]
Then the widely linear Wiener filter or MVDR type of filters can be developed.

Computational Platform/Technology Selection
Three platforms under consideration: ASIC, DSP, FPGA
Trade-off among performance, power consumption, size, and costs
Four competing factors:
  The count of transistors employed
  The number of clock cycles required
  The time taken to develop an application
  Nonrecurring engineering (NRE) costs
ASIC:
  Low numbers of transistors and clock cycles
  Long development time and high NRE costs
  Effective in performance, power, and size, but not in cost
DSP:
  Low development and NRE costs
  Low power consumption
  More effort to convert the design to ASICs
FPGA:
  Not suited to processing sequential conditional data flow, but efficient in concurrent applications
  Supports faster I/O than DSPs
  One step closer to ASIC than DSP
  High development cost due to performance optimization

System Block Diagram
[Block diagram: eight microphone capsules with XLR connectors (pins GND, HOT, COLD) connect through DB25 connectors to microphone powering circuits 1 to 8 and microphone preamps with gain-control jumpers (analog input). An 8-ch 24-bit 48 kHz ADC feeds the Altera FPGA on the FPGA board, which also carries flash, two SDRAMs, a JTAG header, a power management IC with power jack, and a USB 2.0 digital output interface.]
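The figures on the block diagram allow a quick sanity check of the raw data rate produced by the 8-channel, 24-bit, 48 kHz ADC against the USB 2.0 link. The short calculation below assumes no packing overhead and compares against the nominal 480 Mbit/s signaling rate of USB 2.0 High Speed; usable throughput is lower in practice, and the slides do not say whether raw or processed audio is sent to the host.

```python
CHANNELS = 8
BITS_PER_SAMPLE = 24
SAMPLE_RATE_HZ = 48_000
USB2_HIGH_SPEED_BPS = 480_000_000  # nominal signaling rate of USB 2.0 High Speed

raw_bps = CHANNELS * BITS_PER_SAMPLE * SAMPLE_RATE_HZ
share = 100 * raw_bps / USB2_HIGH_SPEED_BPS
print(f"Raw ADC payload: {raw_bps / 1e6:.3f} Mbit/s")
print(f"Share of nominal USB 2.0 High Speed rate: {share:.2f} %")
```

Even with all eight raw channels, the payload is a small fraction of the nominal link rate.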

FPGA Board Block Diagram
[Block diagram: eight OPA1632 stages, OPA1632 (1) to OPA1632 (8), feed an ADS1278 ADC connected to an Altera Cyclone III EP3C55F484C8 FPGA. Around the FPGA: USB 2.0 (High Speed), user LEDs/IOs, two 16 MB SDRAMs (×32), an EPCS16 16 MB flash (×16), a 3.3 V supply, a 50 MHz XTAL, and a second XTAL.]

Prototype FPGA Board: the Top View
[Photo labels: phantom power feeding, user LEDs, user I/Os, FT2232H, USB 2.0 jack, GND, OPA1632, REF1004, ADS1278, EPCS16S, JTAG, 12 MHz crystal, TPS65053, flash, DB25, DC power jack, power LED, mic. preamp gain jumpers, analog power DC 9 V, analog power DC 5 V, Cyclone III FPGA, SDRAMs. Board size: 174.8 mm × 101 mm.]

Prototype FPGA Board: the Bottom View
OPA1632 MHz Clock Oscillator (OSC1) 50 MHz Clock Oscillator (OSC2)

FPGA System Development Flow Adopted in the Project
System on Programmable Chip (SoPC) + C/C++ programming:
  Use SoPC Builder to construct a soft-core NIOS II processor embedded on the Altera FPGA.
  Develop software/DSP systems in C/C++ on the NIOS II processor.
[Figure: SoPC blocks: CPU (NIOS II), ROM, RAM, I/O, UART, DSP.]
Advantages: short development cycle/time, low cost, high reliability, reusability of intellectual property.
Drawbacks: poor efficiency and low performance. Efficiency can be improved by identifying the time-consuming functions (e.g., FFT and IFFT) and accelerating them with the C2H (C-to-Hardware) tool.

MEMS Microphone Array
[Figure, panels a to d: Analog Devices ADMP402 MEMS microphones (2.5 mm × 3.35 mm), numbered 1 to 7, with marked dimensions of 5 mm and 20 mm and the label "Subarrays"; XG-MPC-MEMS board with a Samsung 18-pin connector (pin 1 to pin 18).]

MEMS Microphone Array Box
[Figure: WeVoice MEMS microphone array box with microphones numbered 1 to 7; marked dimensions of 35 mm, 12.5 mm, and 155 mm; Samsung 18-pin connector (pin 1 to pin 18).]

FPGA Program Flowchart
[Flowchart: in each time frame the FPGA takes data in from the ADC and preprocesses it, the FFT/IFFT processor runs a 4-ch FFT, the Nios II soft core runs MCNR+SCNR, the FFT/IFFT processor runs a 1-ch IFFT, and the result is overlap-added and transmitted over USB. The time axis is marked t, t+4, t+8 (ms). Processing delay < 8 ms.]
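The frame-based flow with overlap-add can be sketched in a few lines. The example below assumes a 4 ms hop (suggested by the 4 ms steps on the timeline), a 48 kHz sample rate, 50% overlap, and a periodic Hann analysis window; the actual frame length, window, and FFT size are not given in the transcript. The `process_frame` placeholder stands in for the FFT, MCNR+SCNR, and IFFT stages and is an identity here, so the only thing demonstrated is that framing and overlap-add reconstruct the input.

```python
import math

SAMPLE_RATE_HZ = 48_000                  # ADC rate from the block diagram
HOP_MS = 4                               # assumed from the 4 ms timeline steps
HOP = SAMPLE_RATE_HZ * HOP_MS // 1000    # 192 samples per hop
FRAME = 2 * HOP                          # assumed 50% overlap

# Periodic Hann window: copies shifted by HOP sum to exactly 1.
WINDOW = [0.5 - 0.5 * math.cos(2 * math.pi * n / FRAME) for n in range(FRAME)]

def process_frame(frame):
    """Placeholder for FFT -> MCNR+SCNR -> IFFT; identity in this sketch."""
    return frame

def overlap_add(signal):
    """Window each frame, process it, and overlap-add the results."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - FRAME + 1, HOP):
        frame = [signal[start + n] * WINDOW[n] for n in range(FRAME)]
        processed = process_frame(frame)
        for n in range(FRAME):
            out[start + n] += processed[n]
    return out

signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ) for n in range(10 * HOP)]
output = overlap_add(signal)
# Away from the first and last hop, the input is reconstructed.
max_error = max(abs(output[n] - signal[n]) for n in range(HOP, len(signal) - HOP))
print(f"hop = {HOP} samples, max reconstruction error = {max_error:.2e}")
```

With these assumed sizes, one frame spans 8 ms of audio, which is in line with the delay bound stated on the slide.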

IA System Windows Host Software
Programmed with Microsoft Visual C++. DirectSound is used to play back audio (speech).
[Figure: splash window of the program.]

IA System Windows Host GUI: Multitrack View

IA System Windows Host GUI: Single-Track View

IA System Windows Host GUI: Playing Back

The Portable, Real-Time Demo System
[Photo labels: FPGA board, power supply (linear DC 12-20 V/1 A), suited subject, DB25 connectors, PC, USB 2.0 cable, MEMS microphone array, audio cable.]

What is and Why do we want Immersive Communication?
Telecommunication helps people collaborate and share information by cutting across the following three separations/constraints:
  Long distance
  Real time
  Physical boundaries
Modern telecommunication technologies have so far succeeded in transcending the first two constraints, i.e., the long-distance and real-time constraints.
Immersive communication offers a feeling of being together and sharing a common environment during collaboration.
Immersive communication aims to break the physical boundaries, which is the “last mile” problem in communication.

What needs to be solved for immersive communication systems?
Single-Channel Acoustic Echo Cancellation

Multichannel Acoustic Echo Cancellation

Synthesized Stereo Audio Mixing System

Beamforming

Blind Source Separation

Acoustic Source Localization and Tracking

Stereophony System for Spatial Sound Reproduction

Wave Field Synthesis

Why Immersive Voice Communication in Spacesuits?
Immersive voice communication exploits human binaural hearing.
It provides enhanced situational awareness for a suited crewmember:
  Can improve the productivity of collaboration among the crewmembers
  Can produce potential safety benefits
Crew comfort can be optimized.

What Problems Need to be Solved?
Stereo/Multichannel Acoustic Echo Cancellation (MCAEC)
Integration of MCAEC and MCNR
Three-Dimensional (3D) Audio

Conclusions
While it has been more than 40 years since Neil Armstrong landed on the Moon, astronauts are still using the communication carrier assembly (CCA) based audio system for voice communication in spacesuits.
The new spacesuit design is going to take advantage of the most recent advances in multichannel acoustic and speech signal processing for echo and noise control, while significantly improving crew comfort and ease of use:
  Noise reduction with microphone arrays
  Multichannel echo cancellation
  Integrated echo and noise control
  3D audio
We explained the difference between the traditional beamforming method and what we call the multichannel noise reduction approach.
We presented an intuitive interpretation of the widely linear Wiener filter for single-channel noise reduction.
We described a new application of immersive communication in space exploration, ancillary to its mainstream use in commercial telecommunication systems.
