Iowa State University Developmental Robotics Laboratory Unsupervised Segmentation of Audio Speech using the Voting Experts Algorithm Matthew Miller, Alexander.


1 Iowa State University Developmental Robotics Laboratory
Unsupervised Segmentation of Audio Speech using the Voting Experts Algorithm
Matthew Miller, Alexander Stoytchev
Developmental Robotics Lab, Department of Electrical and Computer Engineering, Iowa State University
mamille@cs.iastate.edu, alexs@iastate.edu
www.cs.iastate.edu/~mamille/

2 Language: A Grand Challenge
–A working example
–Automatically acquires language
–Well studied

3 Statistical Learning Experiments
Saffran et al. (1996): 8-month-olds can segment speech.
Artificial Language: tupiro golabu bedaku padoti
Language: tu pi ro go la bu be da ku ...
Transition Prob: 1.0 1.0 .25 1.0 1.0 .25 1.0 1.0 ...
(Diagram labels: Acclimate, Novel Word)
Hypothesis: Infants use local minima in single-syllable transition probabilities to segment speech streams.
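The hypothesis on this slide can be illustrated with a small sketch. It assumes, as the slide's .25 value suggests, that the four words are drawn uniformly at random, and that each word splits into two-letter syllables. The function names (`make_stream`, `transition_probs`, `segment_at_minima`) are made up for this illustration and are not taken from the original study or the presentation.

```python
import random
from collections import Counter

# The four artificial words quoted on the slide.
WORDS = ["tupiro", "golabu", "bedaku", "padoti"]

def syllables(word):
    """Split a word into two-letter syllables, e.g. 'tupiro' -> tu, pi, ro."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]

def make_stream(n_words, seed=0):
    """Build an unbroken syllable stream from randomly chosen words (assumption)."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_words):
        stream.extend(syllables(rng.choice(WORDS)))
    return stream

def transition_probs(stream):
    """Estimate P(next syllable | current syllable) from bigram counts."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment_at_minima(stream, probs):
    """Insert a word break wherever the transition probability is a local minimum."""
    tp = [probs[(a, b)] for a, b in zip(stream, stream[1:])]
    words, current = [], [stream[0]]
    for i, p in enumerate(tp):
        left = tp[i - 1] if i > 0 else 1.0
        right = tp[i + 1] if i + 1 < len(tp) else 1.0
        if p < left and p < right:
            words.append("".join(current))
            current = []
        current.append(stream[i + 1])
    words.append("".join(current))
    return words

stream = make_stream(200)
probs = transition_probs(stream)
words = segment_at_minima(stream, probs)
print(words[:4])
```

Within a word every transition is estimated as exactly 1.0, while transitions across a word boundary come out near .25, so the local minima fall on the word boundaries.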

4 Voting Experts
An algorithm for unsupervised segmentation
Key Idea: Natural “chunks” have:
–Low Internal Information
–High Boundary Entropy
Example input: itwasabrightcolddayinaprilandtheclockswere
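The two quantities in the key idea can be made concrete with a rough sketch. It assumes that internal information is the surprisal of an n-gram (negative log of its relative frequency) and that boundary entropy is the entropy of the character that follows it. The toy text and function names are illustrative only. The published algorithm also standardizes these scores across n-grams of the same length, which this sketch leaves out.

```python
import math
from collections import Counter, defaultdict

def ngram_stats(text, max_n):
    """Count every n-gram up to max_n and the characters that follow each one."""
    counts = Counter()
    followers = defaultdict(Counter)
    for n in range(1, max_n + 1):
        for i in range(len(text) - n + 1):
            gram = text[i:i + n]
            counts[gram] += 1
            if i + n < len(text):
                followers[gram][text[i + n]] += 1
    return counts, followers

def internal_information(gram, counts, text_len):
    """Surprisal in bits: frequent n-grams carry little internal information."""
    positions = text_len - len(gram) + 1
    return -math.log2(counts[gram] / positions)

def boundary_entropy(gram, followers):
    """Entropy in bits of the next character: high when what follows is unpredictable."""
    nxt = followers[gram]
    total = sum(nxt.values())
    return sum(-(c / total) * math.log2(c / total) for c in nxt.values())

# Toy text (made up for this example): "the" recurs and is followed by varied letters.
text = "thecatthedogthecowthehen"
counts, followers = ngram_stats(text, 3)

print(boundary_entropy("the", followers))   # a likely chunk: varied continuations
print(boundary_entropy("th", followers))    # always followed by "e"
print(internal_information("the", counts, len(text)))
print(internal_information("eca", counts, len(text)))
```

In this toy text "the" behaves like a chunk: it is frequent, so its internal information is low, and it is followed by several different letters, so its boundary entropy is high. "th" is always followed by "e", so its boundary entropy is zero.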

6–8 VE Implementation (Cohen 2006)
1. Build an n-gram trie from text.
2. Slide a window along the text sequence.
3. Two experts vote how to break the window:
   1. One minimizes internal info.
   2. Other maximizes boundary entropy.
4. Break at vote peaks.
Example sequence (shown with the window at positions Window 1 and Window 2): i t w a s a b r i g h t c o l d d a y i n a p r i l
Votes at each boundary: i 0 t 3 w 1 a 0 s 3 a 2 b 0 r 1 i 1 g 0 h 0 t 6 c 1 o 0 l 0 d
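The four steps can be sketched in a few dozen lines. This is a simplified reading of the slide, not Cohen's reference implementation. It replaces the trie with plain n-gram dictionaries, standardizes scores within each n-gram length, gives each expert one vote per window, and breaks at local vote maxima that reach a threshold. The `window` and `threshold` defaults and all function names are assumptions made for this illustration.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, pstdev

def build_stats(text, depth):
    """Step 1: count every n-gram up to `depth` and the characters following it."""
    counts, followers = Counter(), defaultdict(Counter)
    for n in range(1, depth + 1):
        for i in range(len(text) - n + 1):
            gram = text[i:i + n]
            counts[gram] += 1
            if i + n < len(text):
                followers[gram][text[i + n]] += 1
    return counts, followers

def entropy(counter):
    total = sum(counter.values())
    return sum(-(c / total) * math.log2(c / total) for c in counter.values())

def standardize(raw):
    """Z-score each value among n-grams of the same length."""
    by_len = defaultdict(list)
    for gram, value in raw.items():
        by_len[len(gram)].append(value)
    params = {n: (mean(v), pstdev(v)) for n, v in by_len.items()}
    scaled = {}
    for gram, value in raw.items():
        mu, sigma = params[len(gram)]
        scaled[gram] = (value - mu) / sigma if sigma > 0 else 0.0
    return scaled

def voting_experts(text, window=4, threshold=2):
    counts, followers = build_stats(text, window)
    info = standardize({g: -math.log2(c / (len(text) - len(g) + 1))
                        for g, c in counts.items()})
    ent = standardize({g: entropy(followers[g]) for g in counts})

    votes = [0] * (len(text) + 1)           # votes[p] = boundary before text[p]
    for start in range(len(text) - window + 1):   # step 2: slide the window
        chunk = text[start:start + window]
        splits = range(1, window)
        # Step 3.1: split that minimizes the internal information of both parts.
        by_info = min(splits, key=lambda k: info[chunk[:k]] + info[chunk[k:]])
        # Step 3.2: split after the prefix with the highest boundary entropy.
        by_entropy = max(splits, key=lambda k: ent[chunk[:k]])
        votes[start + by_info] += 1
        votes[start + by_entropy] += 1

    # Step 4: break at local vote peaks that reach the threshold.
    breaks = [p for p in range(1, len(text))
              if votes[p] >= threshold
              and votes[p] > votes[p - 1] and votes[p] > votes[p + 1]]
    segments, last = [], 0
    for p in breaks:
        segments.append(text[last:p])
        last = p
    segments.append(text[last:])
    return votes, segments

text = "itwasabrightcolddayinaprilandtheclockswere" * 3
votes, segments = voting_experts(text)
print(segments)
```

On a text this short the statistics are sparse, so the printed segmentation should be taken as a demonstration of the mechanics rather than of the roughly 75% figures reported for full-length text.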

9 VE Results
Results are surprisingly good on text
–Especially given its simplicity
–Accuracy and hit rate about 75%
Seems to capture something about the nature of “chunks”
Can we use this algorithm to segment real audio?

10–13 Acoustic Model
–Cluster spectral features using a GGSOM
–Collapse state sequence
–Run VE to get breaks
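The slides do not spell out what "collapse state sequence" does. One plausible reading, assumed here, is that consecutive audio frames assigned to the same cluster are merged into a single symbol before VE is run. The sketch below shows that reading and also records where each run starts, so a break found by VE could be mapped back to a frame index. The state values and function names are hypothetical.

```python
from itertools import groupby

def collapse(states):
    """Merge runs of identical consecutive states into a single symbol."""
    return [state for state, _ in groupby(states)]

def collapse_with_offsets(states):
    """Like collapse(), but also keep the frame index where each run begins."""
    collapsed, frame = [], 0
    for state, run in groupby(states):
        collapsed.append((state, frame))
        frame += sum(1 for _ in run)
    return collapsed

# Hypothetical per-frame cluster labels from the acoustic model.
frame_states = [3, 3, 3, 7, 7, 1, 3, 3]

print(collapse(frame_states))               # [3, 7, 1, 3]
print(collapse_with_offsets(frame_states))  # [(3, 0), (7, 3), (1, 5), (3, 6)]
```

Under this assumption the collapsed sequence plays the role that the character string plays in the text experiments.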

14 Experiments and Results
Used the model to segment “1984”
–CD 1 of audio book (40 mins)
–Chosen for length, consistency
–Evaluation: Human graders

15 New Experiments
Trained on infant datasets
Tested on manually generated keys
Stream A: tupiro golabu bedaku padoti
Stream B: dapiku tilado pagotu burobi
(Diagram, Train and Test: Acoustic Model A, VE Model A, Key A; Acoustic Model B, VE Model B, Key B)

16 New Experiments
Trained on infant datasets
Tested on manually generated keys
Stream A: tupiro golabu bedaku padoti
Stream B: dapiku tilado pagotu burobi
(Diagram, Test: Acoustic Model A, VE Model A, Key B; Acoustic Model B, VE Model B, Key A)

17 Results
Experiment 1
–Accuracy: 50% on all induced breaks
–Hit Rate: 75% of word breaks
–Significantly better than chance
Experiment 2
–Accuracy: 16% on all induced breaks
–Hit Rate: 1% of word breaks
–Worse than chance
–18 breaks, 3 correct
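The two metrics can be read as follows, judging from the slide's wording. Accuracy is the share of induced breaks that are correct, and hit rate is the share of true word breaks that were found. The sketch below assumes exact matching of break positions. Scoring audio against a manual key would likely need a time tolerance, which is not modeled here. The break positions are invented to reproduce the "18 breaks, 3 correct" case, and the key size is arbitrary.

```python
def score_breaks(induced, true):
    """Return (accuracy, hit_rate) for induced break positions against a key."""
    induced, true = set(induced), set(true)
    correct = induced & true
    accuracy = len(correct) / len(induced) if induced else 0.0
    hit_rate = len(correct) / len(true) if true else 0.0
    return accuracy, hit_rate

# Hypothetical positions: 18 induced breaks, 3 of which are in the key.
induced = range(18)
key = [0, 1, 2] + list(range(100, 130))

accuracy, hit_rate = score_breaks(induced, key)
print(f"accuracy={accuracy:.3f} hit_rate={hit_rate:.3f}")
```

With 3 of 18 induced breaks correct, accuracy is 3/18, about 0.167, in line with the 16% reported for Experiment 2. The hit rate printed here depends entirely on the made-up key size and is not meant to match the slide.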

18–19 Results (figure slides; no text in the transcript)

20 Conclusions and Future Work
VE Model can be used to segment audio
Can reproduce the results of infant studies
May model part of the human chunking mechanism
Have built more sophisticated acoustic models
–Better results (nearly perfect)

21 Discussion Suggestions
Why?
–Can’t we just engineer the solution?
What is really needed for unsupervised speech segmentation?
Can this model be used for object discovery in other domains?

22 Thank You
www.cs.iastate.edu/~mamille/

