By: Meghal Bhatt. Sphinx4 is a state-of-the-art, speaker-independent, continuous speech recognition system written entirely in the Java programming language.


1 By: Meghal Bhatt

2 Sphinx4 is a state-of-the-art, speaker-independent, continuous speech recognition system written entirely in the Java programming language. Its design is based on patterns that have emerged from the design of past systems, as well as on new requirements drawn from areas that researchers currently want to explore. Sphinx4 also includes several implementations of both simple and state-of-the-art techniques.

3 It has different parts: 1) Recognizer 2) Decoder 3) Linguist 4) Acoustic model 5) Front end 6) Instrumentation

4 It recognizes the audio signal spoken by a human and then searches for the same in the transcript file. It is capable of recognizing both discrete and continuous speech.

5 The decoder of the Sphinx-4 speech recognition system incorporates several new design strategies that have not been used in HMM-based large-vocabulary speech recognition systems. It contains the search manager, which performs the search using algorithms such as breadth-first search, best-first search and depth-first search, and it also contains the feature scorer and the pruner. It uses new aspects of graph construction, performing multilevel parallel decoding with independent simultaneous feature streams without the use of compound HMM structures.
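The pruner's role in the search can be illustrated with a minimal sketch of beam pruning. This is not Sphinx-4's own code: the class name BeamPruner, the method prune and its parameters are hypothetical, and hypotheses are reduced to bare log scores for brevity. The assumption is that a hypothesis survives if its score lies within a relative beam of the best score, and that at most maxActive hypotheses are kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of beam pruning; not taken from Sphinx-4.
final class BeamPruner {
    /**
     * Keeps the best-scoring hypotheses: those within relativeBeam of the
     * top log score, and never more than maxActive of them.
     */
    static List<Double> prune(List<Double> logScores, double relativeBeam, int maxActive) {
        List<Double> sorted = new ArrayList<>(logScores);
        sorted.sort(Comparator.reverseOrder()); // best (highest) score first
        List<Double> kept = new ArrayList<>();
        if (sorted.isEmpty()) {
            return kept;
        }
        double threshold = sorted.get(0) - relativeBeam;
        for (double score : sorted) {
            if (kept.size() >= maxActive || score < threshold) {
                break; // everything after this point is worse, so stop
            }
            kept.add(score);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Double> scores = Arrays.asList(-10.0, -12.0, -30.0, -11.0);
        System.out.println(prune(scores, 5.0, 2));
    }
}
```

A narrower beam or a smaller maxActive makes the search faster but raises the risk of discarding the hypothesis that would have turned out best.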

6 The front end performs digital signal processing on the incoming data. The sequence of operations performed by the Sphinx-4 front end creates mel-cepstra from an audio file. Sphinx-4 also includes pluggable language model support for ASCII. The front-end stages include the Hamming window, FFT, mel frequency filter bank, discrete cosine transform, cepstral mean normalization, and feature extraction of cepstra and delta cepstra features.
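Two of the front-end stages named above are simple enough to sketch directly. The following is a minimal illustration, assuming the standard Hamming window formula w[n] = 0.54 - 0.46 cos(2πn / (N - 1)) and per-coefficient mean subtraction over all frames of an utterance. The class and method names are hypothetical and are not Sphinx-4's front-end classes.

```java
import java.util.Arrays;

// Hypothetical illustration of two front-end stages; not Sphinx-4 code.
final class FrontEndSketch {
    /** Hamming window coefficients: 0.54 - 0.46 * cos(2*pi*n / (size - 1)). */
    static double[] hammingWindow(int size) {
        if (size < 2) {
            throw new IllegalArgumentException("window size must be at least 2");
        }
        double[] window = new double[size];
        for (int n = 0; n < size; n++) {
            window[n] = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * n / (size - 1));
        }
        return window;
    }

    /** Cepstral mean normalization: subtracts each coefficient's mean over all frames. */
    static double[][] cepstralMeanNormalize(double[][] frames) {
        int dims = frames[0].length;
        double[] mean = new double[dims];
        for (double[] frame : frames) {
            for (int d = 0; d < dims; d++) {
                mean[d] += frame[d];
            }
        }
        for (int d = 0; d < dims; d++) {
            mean[d] /= frames.length;
        }
        double[][] normalized = new double[frames.length][dims];
        for (int t = 0; t < frames.length; t++) {
            for (int d = 0; d < dims; d++) {
                normalized[t][d] = frames[t][d] - mean[d];
            }
        }
        return normalized;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(hammingWindow(5)));
        double[][] cepstra = {{1.0, 2.0}, {3.0, 6.0}};
        System.out.println(Arrays.deepToString(cepstralMeanNormalize(cepstra)));
    }
}
```

The window tapers each audio frame toward its edges before the FFT, and the mean subtraction removes a constant offset in the cepstra, such as one introduced by the recording channel.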

7 Sphinx-4 has two important acoustic models that serve different purposes. TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800.jar is designed for numbers, so use this acoustic model to recognize digits. WSJ_8gau_13dCep_16k_40mel_130Hz_6800.jar is designed for text data, so a user who wants to recognize text should use this model.

8 The dictionary provides pronunciations for the words found in the language model. The pronunciations split words into sequences of phonemes, which are found in the acoustic model. Its main task is to define how each word is pronounced.
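The word-to-phoneme mapping can be sketched as a small lookup table. This is an illustration only: it assumes a plain-text dictionary with one entry per line, the word followed by its phonemes separated by whitespace, and the class PronunciationDictionary with its methods is hypothetical rather than Sphinx-4's dictionary class. The sample entries are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a pronunciation dictionary; not Sphinx-4 code.
final class PronunciationDictionary {
    private final Map<String, List<String>> entries = new HashMap<>();

    /** Parses one line of the assumed format: WORD PHONEME PHONEME ... */
    void addLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected a word and at least one phoneme: " + line);
        }
        List<String> phonemes = new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
        entries.put(parts[0].toLowerCase(), phonemes);
    }

    /** Returns the phoneme sequence for a word, or an empty list if the word is missing. */
    List<String> lookup(String word) {
        return entries.getOrDefault(word.toLowerCase(), Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        PronunciationDictionary dictionary = new PronunciationDictionary();
        dictionary.addLine("ONE W AH N");
        dictionary.addLine("TWO T UW");
        System.out.println(dictionary.lookup("two"));
    }
}
```

Every phoneme that appears in such a dictionary has to exist in the acoustic model, otherwise the recognizer has no way to score that word.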

9 The language model contains a representation of the probability of occurrence of words. There are basically two types of model that describe the language. Statistical language model: a statistical language model estimates the probability distribution of natural language. The most widely used statistical language model is the N-gram. Grammar language model: a grammar describes very simple types of languages for command and control, and it is written by hand or generated automatically by plain code.
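As a concrete case of an N-gram model, a bigram model estimates the probability of a word given the word before it. The sketch below uses unsmoothed relative frequencies, P(next | prev) = count(prev next) / count(prev as a history), which is an assumption made for brevity since practical models add smoothing. The class BigramModel and the training sentences are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a bigram language model; not Sphinx-4 code.
final class BigramModel {
    private final Map<String, Integer> historyCounts = new HashMap<>();
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    /** Counts every adjacent word pair in a whitespace-separated sentence. */
    void train(String sentence) {
        String[] words = sentence.trim().toLowerCase().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            historyCounts.merge(words[i], 1, Integer::sum);
            bigramCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    /** Unsmoothed estimate of P(next | prev); 0.0 if prev was never seen as a history. */
    double probability(String prev, String next) {
        int history = historyCounts.getOrDefault(prev.toLowerCase(), 0);
        if (history == 0) {
            return 0.0;
        }
        int pair = bigramCounts.getOrDefault(prev.toLowerCase() + " " + next.toLowerCase(), 0);
        return pair / (double) history;
    }

    public static void main(String[] args) {
        BigramModel model = new BigramModel();
        model.train("open the door");
        model.train("close the door");
        model.train("open the window");
        System.out.println(model.probability("the", "door"));
    }
}
```

A grammar, by contrast, would simply list the allowed command phrases, which suits small command-and-control vocabularies where the word sequences are known in advance.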

10 The configuration file determines the configuration of the open-source Sphinx-4 framework. It defines the following: the names and types of the components; the connectivity between the components, that is, how they relate to each other; and the detailed configuration of each of these components.

11 Basically there are three steps to use a new model in Sphinx-4: defining a language model, defining a dictionary, and defining an acoustic model.

12 <property name="grammarLocation" value="the path to the grammar folder"/> <property name="dictionary" value="dictionary"/> <property name="grammarName" value="the name of grammar"/> <property name="logMath" value="logMath"/>

13 <property name="location" value=" the path to the model folder "/> <property name="location" value=" the path to the model folder "/>

14 <property name="dictionaryPath" value="the name of the dictionary file"/> <property name="fillerPath" value="the name of the filler file"/> <property name="allowMissingWords" value="false"/> <property name="unitManager" value="unitManager"/>

15 Thank you

