Improvement of Audio Capture in Handheld Devices through Digital Filtering

Problem
Microphones in handheld devices are of low quality to reduce cost. This makes speech recognition less reliable.

Choosing a Test Sound
Two test sounds were compared to find which worked better. One was a sine wave increasing in frequency from 80 Hz to 8000 Hz. The other was pink noise. Speech was recorded from four different people, and the recordings were filtered using the coefficients produced by the least mean square algorithm. The recordings were then tested against two speech recognizers: the one built into Windows XP and the one built into Windows 7. The Windows 7 recognizer had a higher baseline success rate than the XP recognizer. Overall, the filter created from the pink noise fixed more speech recognition errors than the filter created from the sine wave. Also, all but one of the phrases fixed by the sine wave filter were also fixed by the pink noise filter.

For the Future
Test more filter lengths, iterations, gains, sound files
Insert filter into Windows Mobile recording stack
Add options to the program to change the filter creation parameters

Jonathan Brown: firstname.lastname@example.org
Sam Marlin: email@example.com
Advisor: W. T. Miller

Proposed Solution
Using digital signal processing, a filter will be created to "undo" the distortion caused by the poor quality microphone. This process will be able to generate a filter for any handheld that uses the Windows Mobile platform, creating a custom tailored filter based on the acoustic characteristics of each device. Reference audio files, with known frequency components, will be used to find what frequencies are attenuated by the handheld.

Testing the Code
All the code was first written in Matlab for testing purposes. The code was then ported to C# for final deployment.
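As an illustration of a reference file with known frequency components, the swept sine test sound (80 Hz to 8000 Hz) could be generated along the following lines. This is a minimal sketch, not the authors' Matlab or C# code: the function name `linear_sweep`, the linear shape of the sweep, the one-second duration, and the 44100 Hz sample rate are all assumptions, since the source gives only the start and end frequencies.

```python
import math


def linear_sweep(f_start, f_end, duration_s, sample_rate):
    """Return samples of a sine wave whose frequency rises linearly.

    Hypothetical helper: the source does not say whether the sweep was
    linear or logarithmic, so a linear sweep is assumed here.
    """
    n_samples = int(duration_s * sample_rate)
    rate = (f_end - f_start) / duration_s  # frequency change in Hz per second
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        # Phase is the integral of 2*pi*f(t), with f(t) = f_start + rate * t.
        phase = 2.0 * math.pi * (f_start * t + 0.5 * rate * t * t)
        samples.append(math.sin(phase))
    return samples


# 80 Hz to 8000 Hz as in the text; duration and sample rate are placeholders.
sweep = linear_sweep(80.0, 8000.0, duration_s=1.0, sample_rate=44100)
```

Because the frequency at every instant is known, any attenuation seen in the handheld's recording of such a signal can be attributed to a specific frequency range.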
Steps of the Solution Process
1. Setup: Set up the computer, speakers, and handheld device.
2. Record Test Sound: Play an ideal test sound from the computer while recording it on the handheld.
3. Create the Filter: The program on the computer compares the test sound and the recorded sound to create the filter.
4. Save the Filter: The filter coefficients are then saved into the registry of the handheld device for use by any audio recording or voice recognition application.

Lining up the Sound Files
Each test sound file had 10 cycles of a 440 Hz sine wave at its start. This knowledge was used to line up the two sound files through cross-correlation.

Problem
This initial cross-correlation did not keep the sound files lined up for all time. The time steps in the two sounds are different, so after 1000 samples the files would be noticeably misaligned. To fix this, cross-correlation was used again to match the indexes in one file to those in the other.

Creating the Filter
The least mean square algorithm was used to create the filter coefficients. For this algorithm to work, the test files have to be lined up in time. The algorithm has several tunable parameters, so tests were done to find the best filter parameters for the problem. The sine wave test sound file was used in these parameter tests. The choice depended on two criteria: the RMS of the error value used in the algorithm, and whether the filter coefficients still changed as the number of iterations varied.
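The two steps above, alignment by cross-correlation and filter creation by least mean squares, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the names `best_lag` and `lms_fit` are hypothetical, the cross-correlation is a brute-force search over a limited lag range, the filter is a plain causal FIR, and one "iteration" is assumed to mean one full pass over the recording, which the source does not confirm.

```python
import math
import random


def best_lag(reference, recorded, max_lag):
    """Return the shift (in samples) at which `recorded` best matches `reference`.

    Brute-force cross-correlation. A positive result means the recording
    starts that many samples late relative to the reference.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, ref in enumerate(reference):
            m = n + lag
            if 0 <= m < len(recorded):
                score += ref * recorded[m]
        if score > best_score:
            best, best_score = lag, score
    return best


def lms_fit(ideal, non_ideal, filter_size, gain, iterations):
    """Fit FIR coefficients FC so that FC applied to NI approximates I.

    Assumption: one iteration is one pass over the aligned recordings.
    """
    fc = [0.0] * filter_size
    length = min(len(ideal), len(non_ideal))
    for _ in range(iterations):
        for n in range(filter_size - 1, length):
            # Most recent samples first: NI[n], NI[n-1], ..., NI[n-FS+1].
            window = non_ideal[n - filter_size + 1 : n + 1][::-1]
            estimate = sum(c * x for c, x in zip(fc, window))
            e = ideal[n] - estimate  # equalization error
            # LMS update: nudge each coefficient along e * input sample.
            fc = [c + gain * e * x for c, x in zip(fc, window)]
    return fc


if __name__ == "__main__":
    # Synthetic demo: a 10-cycle 440 Hz burst delayed by 37 samples.
    sample_rate = 44100  # placeholder value, not from the source
    burst_len = round(10 * sample_rate / 440)
    burst = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(burst_len)]
    recorded = [0.0] * 37 + burst + [0.0] * 50
    print("estimated lag:", best_lag(burst, recorded, max_lag=100))

    # Synthetic demo: recover a known 3-tap filter from noise-free data.
    rng = random.Random(0)
    ni = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    true_fc = [0.5, -0.3, 0.1]
    i_sig = [sum(true_fc[k] * ni[n - k] for k in range(3) if n - k >= 0)
             for n in range(len(ni))]
    print("fitted coefficients:", lms_fit(i_sig, ni, 3, gain=0.05, iterations=5))
```

Pure Python is used only to keep the sketch self-contained. With a filter size of 257 and real recordings, a vectorized or compiled implementation would be needed, and the gain would have to be much smaller than in this demo, in line with the values tested in the source.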
Numbers used in testing:
Gain: 0.001, 0.0001, 0.00001
Iterations: 500 to 3900 in steps of 200
Filter Size: 257

I = the ideal waveform
NI = the non-ideal waveform
FC = the filter coefficients
FS = filter size
e = equalization error
g = the gain

Windows 7 Speech Recognizer
Noise        Unfiltered   Filter from Sine Wave   Filter from Pink Noise
Recognized   158          159                     167
Broke        -            5                       6
Fixed        -            6                       15

Final Values:
Gain: 0.0001
Iterations: 1500
Filter Size: 257

Conclusions
The filter developed using the pink noise test signal resulted in a statistically significant improvement in speech recognizer performance at the 90% confidence level (from 79% to 83.5% correct). This indicates that the technique could provide a functionally significant improvement in practice, and warrants further investigation.
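The source does not say which significance test was used. As a rough cross-check under that caveat, an exact sign test on the phrases whose outcome changed (15 fixed versus 6 broken by the pink noise filter in the Windows 7 table) can be sketched as below. The function name `sign_test_p_values` and the choice of test are assumptions made for illustration.

```python
from math import comb


def sign_test_p_values(fixed, broke):
    """Exact sign test on phrases whose recognition outcome changed.

    Under the null hypothesis that the filter has no effect, a changed
    phrase is equally likely to have been fixed or broken. Returns the
    (one_sided, two_sided) p-values for a split at least as lopsided as
    the one observed.
    """
    n = fixed + broke
    tail = sum(comb(n, k) for k in range(max(fixed, broke), n + 1))
    one_sided = tail / 2 ** n
    two_sided = min(1.0, 2 * one_sided)
    return one_sided, two_sided


# Counts from the Windows 7 table: pink noise filter fixed 15 phrases, broke 6.
one_sided, two_sided = sign_test_p_values(fixed=15, broke=6)
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
# prints "one-sided p = 0.0392, two-sided p = 0.0784"
```

Both values fall below 0.10, which is consistent with the stated 90% confidence level. The same calculation for the sine wave filter (6 fixed, 5 broken) gives a one-sided p-value of 0.5, so it provides no evidence of a change.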