Conditional Random Fields


1 Conditional Random Fields
- A form of discriminative modelling
- Has been used successfully in various domains, such as part-of-speech tagging and other Natural Language Processing tasks
- Processes evidence bottom-up
  - Combines multiple features of the data
  - Builds the probability P(sequence | data)

2 Conditional Random Fields
- CRFs are based on the idea of Markov Random Fields
  - Modelled as an undirected graph connecting labels with observations
  - Observations in a CRF are not modelled as random variables
[Figure: undirected graph linking the labels /k/ and /iy/ to the observations X]
- Transition functions add associations between transitions from one label to another
- State functions help determine the identity of the state

3 Conditional Random Fields
[Equation image not preserved in the transcript; the four items below annotate its terms]
- State Feature Function: f([x is stop], /t/)
  One possible state feature function for our attributes and labels
- State Feature Weight: λ = 10
  One possible weight value for this state feature (strong)
- Transition Feature Function: g(x, /iy/, /k/)
  One possible transition feature function; indicates /k/ followed by /iy/
- Transition Feature Weight: μ = 4
  One possible weight value for this transition feature
- The Hammersley-Clifford Theorem states that a random field is an MRF iff it can be described in the above form
  - The exponent is a sum over the cliques of the undirected graph, one (log) potential term per clique
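
The form this slide refers to did not survive in the transcript. A standard linear-chain CRF, written with the slide's symbols (state features f with weights λ, transition features g with weights μ), would look roughly as follows. The indices i (position in the sequence), j and k (feature indices), and the normalizer Z(x) are assumptions, not taken from the slides.

```latex
P(y \mid x) = \frac{1}{Z(x)} \exp\left( \sum_{i} \left( \sum_{j} \lambda_j \, f_j(x, y_i) + \sum_{k} \mu_k \, g_k(x, y_i, y_{i-1}) \right) \right)

Z(x) = \sum_{y'} \exp\left( \sum_{i} \left( \sum_{j} \lambda_j \, f_j(x, y'_i) + \sum_{k} \mu_k \, g_k(x, y'_i, y'_{i-1}) \right) \right)
```

The argument order of g follows the slide's example g(x, /iy/, /k/), where the current label /iy/ comes before the previous label /k/.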

4 Conditional Random Fields
- Conceptual Overview
  - Each attribute of the data we are trying to model fits into a feature function that associates the attribute and a possible label
    - A positive value if the attribute appears in the data
    - A zero value if the attribute is not in the data
  - Each feature function carries a weight that gives the strength of that feature function for the proposed label
    - High positive weights indicate a good association between the feature and the proposed label
    - Large negative weights indicate a negative association between the feature and the proposed label
    - Weights close to zero indicate the feature has little or no impact on the identity of the label
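
The overview above can be illustrated with a toy example. This is a minimal sketch, not the system described in the slides: only the weights λ = 10 for ([x is stop], /t/) and μ = 4 for /k/ followed by /iy/ come from the deck; the other weights, the three-label set, and the two-frame input are made up for illustration. It computes P(sequence | data) by brute-force enumeration, which is only feasible for tiny inputs.

```python
import itertools
import math

LABELS = ["/k/", "/iy/", "/t/"]

# (attribute, label) -> state feature weight. Only ("stop", "/t/") = 10 is
# from the slides; the other two values are hypothetical.
STATE_WEIGHTS = {("stop", "/t/"): 10.0, ("stop", "/k/"): 8.0, ("vowel", "/iy/"): 9.0}

# (previous label, current label) -> transition feature weight (from the slides).
TRANSITION_WEIGHTS = {("/k/", "/iy/"): 4.0}


def score(frames, labels):
    """Unnormalized log score: weighted state features plus weighted transitions."""
    total = 0.0
    for i, (attributes, label) in enumerate(zip(frames, labels)):
        for attribute, value in attributes.items():
            # Missing (attribute, label) pairs behave like a weight of zero.
            total += STATE_WEIGHTS.get((attribute, label), 0.0) * value
        if i > 0:
            total += TRANSITION_WEIGHTS.get((labels[i - 1], label), 0.0)
    return total


def sequence_probability(frames, labels):
    """P(labels | frames), normalizing over every possible label sequence."""
    all_scores = [score(frames, candidate)
                  for candidate in itertools.product(LABELS, repeat=len(frames))]
    top = max(all_scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in all_scores))
    return math.exp(score(frames, labels) - log_z)


# Two frames: the first looks like a stop, the second like a vowel.
frames = [{"stop": 1.0}, {"vowel": 1.0}]
best = max(itertools.product(LABELS, repeat=len(frames)),
           key=lambda candidate: score(frames, candidate))
print(best, sequence_probability(frames, best))
```

In this toy setup the state feature alone prefers /t/ for the first frame (10 versus 8), but the transition weight for /k/ followed by /iy/ tips the best sequence to (/k/, /iy/).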

5 Experimental Setup
- Attribute Detectors
  - ICSI QuickNet Neural Networks
- Two different types of attributes
  - Phonological feature detectors
    - Place, Manner, Voicing, Vowel Height, Backness, etc.
    - Features are grouped into eight classes, with each class having a variable number of possible values based on the IPA phonetic chart
  - Phone detectors
    - Neural network outputs based on the phone labels, one output per label
- Classifiers were applied to 2960 utterances from the TIMIT training set

6 Experimental Setup
- Outputs from the neural nets are themselves treated as feature functions for the observed sequence: each attribute/label combination gives us a value for one feature function
- Note that this makes the feature functions non-binary
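
A minimal sketch of what a non-binary state feature function could look like, assuming each frame carries a dictionary of detector outputs. The attribute names, the output values, and the helper `state_feature` are hypothetical and only illustrate the idea of one feature function per attribute/label combination.

```python
def state_feature(attribute, label):
    """Build the feature function for one attribute/label combination."""
    def feature(frame_outputs, proposed_label):
        # The feature fires with the detector's real-valued output (not 0/1)
        # when the proposed label matches, and is zero otherwise.
        if proposed_label == label:
            return frame_outputs[attribute]
        return 0.0
    return feature


# Hypothetical neural-network outputs for a single frame.
frame = {"manner=stop": 0.92, "voicing=voiceless": 0.85, "manner=vowel": 0.03}

f_stop_t = state_feature("manner=stop", "/t/")
print(f_stop_t(frame, "/t/"))   # 0.92
print(f_stop_t(frame, "/iy/"))  # 0.0
```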

7 Experiment 1
- Goal: Implement a Conditional Random Field model on ASAT-style phonological feature data
  - Perform phone recognition
  - Compare results to those obtained via a Tandem HMM system

8 Experiment 1 - Results

Model                 Phone Accuracy   Phone Correct
Tandem [monophone]    61.48%           63.50%
Tandem [triphone]     66.69%           72.52%
CRF [monophone]       65.29%           66.81%

- The CRF system trained on monophones with these features achieves accuracy superior to the HMM on monophones
- The CRF comes close to achieving HMM triphone accuracy
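
The slides do not define the two result columns. Under the usual scoring convention for phone recognition (assumed here, not stated in the deck), "correct" ignores insertion errors while "accuracy" also penalizes them, which is consistent with accuracy being the lower figure in every row. A small sketch with made-up counts:

```python
def phone_correct(n_reference, substitutions, deletions):
    """Percent correct: reference phones that were neither substituted nor deleted."""
    return 100.0 * (n_reference - substitutions - deletions) / n_reference


def phone_accuracy(n_reference, substitutions, deletions, insertions):
    """Percent accuracy: like percent correct, but insertions also count as errors."""
    return 100.0 * (n_reference - substitutions - deletions - insertions) / n_reference


# Hypothetical error counts for 100 reference phones.
print(phone_correct(100, 10, 5))      # 85.0
print(phone_accuracy(100, 10, 5, 3))  # 82.0
```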

9 Experiment 2
- Goals:
  - Apply the CRF model to phone classifier data
  - Apply the CRF model to combined phonological feature classifier data and phone classifier data
  - Perform phone recognition
  - Compare results to those obtained via a Tandem HMM system

10 Experiment 2 - Results

Model                          Phone Accuracy   Phone Correct
Tandem [mono] (phones)         60.48%           63.30%
Tandem [tri] (phones)          67.32%           73.81%
CRF [mono] (phones)            66.89%           68.49%
Tandem [mono] (phones/feas)    61.78%           63.68%
Tandem [tri] (phones/feas)     67.96%           73.40%
CRF [mono] (phones/feas)       68.00%           69.58%

Note that the Tandem HMM result is the best result obtained with only the top 39 features following a principal components analysis.

11 Experiment 3
- Goal:
  - Previous CRF experiments used phone posteriors for the CRF, and linear outputs transformed via a Karhunen-Loeve (KL) transform for the HMM system
    - This transformation is needed to improve the HMM performance through decorrelation of inputs
  - Using the same linear outputs as the HMM system, do our results change?

12 Experiment 3 - Results

Model                            Phone Accuracy   Phone Correct
CRF (phones) posteriors          67.27%           68.77%
CRF (phones) linear KL           66.60%           68.25%
CRF (phones) post. + linear      68.18%           69.87%
CRF (features) posteriors        65.25%           66.65%
CRF (features) linear KL         66.32%           67.95%
CRF (features) post. + linear    66.89%           68.48%
CRF (features) linear (no KL)    65.89%           68.46%

Also shown: adding both feature sets together and giving the system supposedly redundant information leads to a gain in accuracy.

13 Experiment 4
- Goal:
  - Previous CRF experiments did not allow for realignment of the training labels
    - Boundaries for labels provided by TIMIT hand transcribers were used throughout training
    - HMM systems were allowed to shift boundaries during EM learning
  - If we allow for realignment in our training process, can we improve the CRF results?
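
The slides do not say how realignment was carried out. One common way to realize the idea is forced alignment: keep the order of the transcribed labels fixed and let a dynamic program choose the frame boundaries that score best under the current model. The sketch below assumes per-frame label scores are already available; the function name `realign`, the score values, and the two-label example are hypothetical and should not be read as the authors' procedure.

```python
def realign(frame_scores, label_sequence):
    """Assign one label per frame, keeping label order fixed, maximizing total score.

    frame_scores: list of dicts mapping label -> score for that frame.
    label_sequence: the ordered labels of the utterance (each covers >= 1 frame).
    """
    n_frames, n_labels = len(frame_scores), len(label_sequence)
    neg_inf = float("-inf")
    best = [[neg_inf] * n_labels for _ in range(n_frames)]
    back = [[0] * n_labels for _ in range(n_frames)]
    best[0][0] = frame_scores[0][label_sequence[0]]

    for t in range(1, n_frames):
        for j in range(n_labels):
            stay = best[t - 1][j]                               # same label as before
            advance = best[t - 1][j - 1] if j > 0 else neg_inf  # move to next label
            if advance > stay:
                previous, back[t][j] = advance, j - 1
            else:
                previous, back[t][j] = stay, j
            if previous > neg_inf:
                best[t][j] = previous + frame_scores[t][label_sequence[j]]

    if best[-1][-1] == neg_inf:
        raise ValueError("fewer frames than labels: no valid alignment")

    # Trace back from the last label at the last frame.
    j = n_labels - 1
    path = [j]
    for t in range(n_frames - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return [label_sequence[j] for j in path]


# Hypothetical per-frame scores for a /k/ followed by an /iy/.
frame_scores = [
    {"/k/": 2.0, "/iy/": 0.1},
    {"/k/": 1.5, "/iy/": 0.3},
    {"/k/": 0.2, "/iy/": 1.8},
    {"/k/": 0.1, "/iy/": 2.2},
]
print(realign(frame_scores, ["/k/", "/iy/"]))  # ['/k/', '/k/', '/iy/', '/iy/']
```

In a training loop, the realigned frame labels would replace the hand-transcribed boundaries before the next round of parameter updates.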

14 Experiment 4 - Results

Model                        Phone Accuracy   Phone Correct
Tandem [tri] (phones)        67.32%           73.81%
CRF (phones) no realign      67.27%           68.77%
CRF (phones) realign         69.63%           72.40%
Tandem [tri] (features)      66.69%           72.52%
CRF (features) no realign    65.25%           66.65%
CRF (features) realign       67.52%           70.13%

Allowing realignment gives accuracy results for a monophone-trained CRF that are superior to those of a triphone-trained HMM, with fewer parameters.

