Detecting Speech Project 1. Outline Motivation Problem Statement Details Hints.

Detecting Speech Project 1

Outline
Motivation
Problem Statement
Details
Hints

Motivation
Word recognition needs to detect word boundaries in speech
“Silence Is Golden”

Motivation
Recognizing silence can reduce:
  –Network bandwidth
  –Processing load
Easy in a soundproof room, with digitized tape
  –Measure energy level in digitized voice

Research Problem
A noisy computer room has loud background noise, making some word edges difficult to detect
“Five”

Research Problem
Computer audio is often used for interactive applications
  –Voice commands
  –Teleconferencing
Detection therefore needs to be done in real time

Project Solution
Implement end-point algorithm by Rabiner and Sambur
  –(Paper for class, next)
Implementation in Linux
Basis for audioconference/Internet phone
  –(Projects 2 and 3)

Details
Voice-quality:
  –8000 samples/second
  –8 bits per sample
  –One channel
Record sound, write files:
  –sound.all - audio plus silence
  –sound.speech - audio no silence
  –sound.data - text-based data: audio data, energy, zero crossings
Other features allowed
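At these settings the stream is 8000 bytes per second, so a 10 ms analysis frame holds 80 samples. The two per-frame features named for sound.data could be computed roughly as below. This is a sketch under assumptions not stated in the slides: the 10 ms frame length, the use of summed magnitudes as the "energy" measure, unsigned 8-bit samples centered on 128, and the function names frame_energy and frame_zero_crossings are all illustrative.

```c
#include <stddef.h>
#include <stdlib.h>

#define SAMPLE_RATE   8000                             /* samples/second, from the slide */
#define FRAME_MS      10                               /* assumed frame length */
#define FRAME_SAMPLES (SAMPLE_RATE * FRAME_MS / 1000)  /* 80 samples per frame */
#define MIDPOINT      128                              /* assumed zero level of unsigned 8-bit audio */

/* "Energy" of one frame, taken here as the sum of absolute amplitudes. */
long frame_energy(const unsigned char *buf, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += labs((long)buf[i] - MIDPOINT);
    return sum;
}

/* Number of sign changes in one frame; samples exactly at the
 * midpoint are skipped so they do not count as a crossing twice. */
int frame_zero_crossings(const unsigned char *buf, size_t n)
{
    int count = 0;
    int prev = 0;  /* sign of the last nonzero sample: -1, +1, or 0 if none yet */
    for (size_t i = 0; i < n; i++) {
        int v = (int)buf[i] - MIDPOINT;
        int sign = (v > 0) - (v < 0);
        if (sign == 0)
            continue;
        if (prev != 0 && sign != prev)
            count++;
        prev = sign;
    }
    return count;
}
```

Each line of sound.data could then hold one frame's values, for example the frame index followed by its energy and zero-crossing count.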

Sound in Linux
Linux audio device just like a file:
  –/dev/dsp
  –open("/dev/dsp", O_RDWR)
Recording and Playing by:
  –read() to record
  –write() to play

Sound Parameters
Use ioctl() to change sound card parameters
To change sample size to 8 bits:
  fd = open("/dev/dsp", O_RDWR);
  arg = 8;
  ioctl(fd, SOUND_PCM_WRITE_BITS, &arg);
Remember to error check all system calls!

Sound Parameters
The parameters you will be interested in are:
  –SOUND_PCM_WRITE_BITS the number of bits per sample
  –SOUND_PCM_WRITE_CHANNELS mono or stereo
  –SOUND_PCM_WRITE_RATE sample/playback rate
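Putting the three parameters together with the error checking the slides ask for might look like the sketch below. The helper names set_param and open_sound_device are hypothetical, and it assumes a system that still provides the OSS header <sys/soundcard.h> and a /dev/dsp device. The driver may write back a different value than the one requested; this sketch treats that as a failure, although a real program might choose to accept a sample rate that is merely close.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

/* Apply one setting and check that the driver kept the requested value.
 * Returns 0 on success, -1 on failure. */
static int set_param(int fd, unsigned long request, int wanted, const char *name)
{
    int arg = wanted;
    if (ioctl(fd, request, &arg) == -1) {
        perror(name);
        return -1;
    }
    if (arg != wanted) {
        fprintf(stderr, "%s: asked for %d, driver chose %d\n", name, wanted, arg);
        return -1;
    }
    return 0;
}

/* Open the sound device and configure it.
 * Returns an open file descriptor, or -1 on any error. */
int open_sound_device(const char *path, int bits, int channels, int rate)
{
    int fd = open(path, O_RDWR);
    if (fd == -1) {
        perror(path);
        return -1;
    }
    if (set_param(fd, SOUND_PCM_WRITE_BITS, bits, "SOUND_PCM_WRITE_BITS") == -1 ||
        set_param(fd, SOUND_PCM_WRITE_CHANNELS, channels, "SOUND_PCM_WRITE_CHANNELS") == -1 ||
        set_param(fd, SOUND_PCM_WRITE_RATE, rate, "SOUND_PCM_WRITE_RATE") == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For the voice-quality settings in this project the call would be open_sound_device("/dev/dsp", 8, 1, 8000).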

Program Template
  open sound device
  set sound device parameters
  record silence
  set algorithm parameters
  while(1)
    record sound
    compute algorithm stuff
    detect speech
    write data to file
    write sound to file
    if speech, write speech to file
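The loop in the template could be sketched in C as follows. This is a deliberate simplification and not the Rabiner and Sambur algorithm: it classifies each frame with a single energy threshold, which the caller is assumed to have derived from the initial silence recording, and it omits the zero-crossing test and the sound.data output. The function name process_stream, the 80-sample frame, and the midpoint of 128 are illustrative assumptions. It works on any file descriptors, so it can be tried on files or pipes before touching the sound device.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define FRAME_SAMPLES 80   /* 10 ms at 8000 samples/second (assumed frame size) */
#define MIDPOINT      128  /* assumed zero level of unsigned 8-bit audio */

/* Read exactly n bytes unless end of input or an error comes first.
 * read() may return fewer bytes than requested, so loop. */
static ssize_t read_full(int fd, unsigned char *buf, size_t n)
{
    size_t got = 0;
    while (got < n) {
        ssize_t r = read(fd, buf + got, n - got);
        if (r < 0)
            return -1;
        if (r == 0)
            break;
        got += (size_t)r;
    }
    return (ssize_t)got;
}

/* Sum of absolute amplitudes, used here as the frame's energy. */
static long sum_magnitude(const unsigned char *buf, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += labs((long)buf[i] - MIDPOINT);
    return sum;
}

/* Copy every frame from in_fd to all_fd, and frames whose energy exceeds
 * threshold to speech_fd as well. Stops at end of input.
 * Returns the number of speech frames, or -1 on an I/O error. */
long process_stream(int in_fd, int all_fd, int speech_fd, long threshold)
{
    unsigned char frame[FRAME_SAMPLES];
    long speech_frames = 0;

    for (;;) {
        ssize_t n = read_full(in_fd, frame, sizeof frame);
        if (n < 0)
            return -1;
        if (n == 0)
            break;
        if (write(all_fd, frame, (size_t)n) != n)
            return -1;
        if (sum_magnitude(frame, (size_t)n) > threshold) {
            if (write(speech_fd, frame, (size_t)n) != n)
                return -1;
            speech_frames++;
        }
        if ((size_t)n < sizeof frame)
            break;  /* short final frame: the input has ended */
    }
    return speech_frames;
}
```

With a live device the loop never reaches end of input, which matches the while(1) in the template; the program would then be stopped by a signal or a key press.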

Hand In
Staggered due dates, about 2 weeks
  –Send group info
Turn in:
  –Code
  –Makefile
  –Answers to questions
Via