Structure Discovery of Pop Music Using HHMM E6820 Project Jessie Hsu 03/09/05.

Presentation transcript:

1 Structure Discovery of Pop Music Using HHMM
  E6820 Project
  Jessie Hsu
  03/09/05

2 Problem Description
  - Given: WAV signal of a pop song
  - Discover the structure of the song
    - Intro
    - Verse
    - Chorus
    - Bridge
    - Outro

3 HMM Framework
  - Model the music signal as a series of state transitions
  [Figure: a chain of hidden states, each emitting one observation]
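The generative view on this slide can be sketched as follows. This is a minimal illustration, not the project's code: the function name `sample_hmm` and the choice of a 1-D Gaussian emission per state are assumptions made for the example.

```python
import random


def sample_hmm(prior, trans, emit, n_steps, seed=0):
    """Sample (states, observations) from a discrete-state HMM.

    prior[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i]     : (mean, std) of a 1-D Gaussian emitted by state i (assumed form)
    """
    rng = random.Random(seed)
    n_states = len(prior)
    states, observations = [], []
    # Draw the initial hidden state from the prior.
    state = rng.choices(range(n_states), weights=prior)[0]
    for _ in range(n_steps):
        mean, std = emit[state]
        states.append(state)
        # Each hidden state emits one observation.
        observations.append(rng.gauss(mean, std))
        # Move to the next hidden state according to the transition row.
        state = rng.choices(range(n_states), weights=trans[state])[0]
    return states, observations
```

Structure discovery runs this process in reverse: only the observations are available, and the hidden states have to be inferred.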

4 HMM Framework: Hierarchical HMM
  - Hidden states at structure level: Intro, Verse, Outro
  - Hidden states at frame level
  - Observations
  - Each observation is an audio frame of one beat length
  [Figure: structure-level states above frame-level states above observations]
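One common way to reason about a two-level model like this is to flatten it into an ordinary HMM whose states are (structure state, frame state) pairs. The sketch below assumes a per-frame-state exit probability that hands control back to the structure level; that parametrization and all names (`flatten_hhmm`, `exit_prob`, and so on) are assumptions for illustration and do not come from the slides.

```python
def flatten_hhmm(top_trans, sub_trans, sub_prior, exit_prob):
    """Flatten a two-level HHMM into one transition matrix over (s, k) pairs.

    top_trans[s][s2]    : structure-level transition probabilities
    sub_trans[s][k][k2] : frame-level transitions inside structure state s
    sub_prior[s][k]     : frame-level prior when structure state s is entered
    exit_prob[s][k]     : assumed probability of leaving s from frame state k
    """
    n_sections = len(top_trans)
    n_sub = len(sub_prior[0])
    index = [(s, k) for s in range(n_sections) for k in range(n_sub)]
    flat = []
    for (s, k) in index:
        row = []
        for (s2, k2) in index:
            # Stay inside the current structure state.
            stay = (1.0 - exit_prob[s][k]) * sub_trans[s][k][k2] if s2 == s else 0.0
            # Exit, pick the next structure state, then enter one of its frame states.
            leave = exit_prob[s][k] * top_trans[s][s2] * sub_prior[s2][k2]
            row.append(stay + leave)
        flat.append(row)
    return index, flat
```

Each row of the flattened matrix sums to one whenever the input distributions do, so standard HMM algorithms can be run on it directly.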

5 Representing an HHMM
  - HHMM parameters
    - Prior of each state at structure level and frame level: π
    - State transition probabilities at structure level and frame level: α
    - Emission parameters for each state at both levels
      - Each state is modeled as a mixture of Gaussians
      - Mean μ and covariance matrix Σ of each Gaussian
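As a rough illustration of the emission model, the sketch below evaluates the log-likelihood of one feature vector under a Gaussian mixture. It assumes diagonal covariances to stay dependency-free, whereas the slide allows full covariance matrices Σ; the function name and argument layout are hypothetical.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian mixture.

    weights[m]   : mixture weight of component m (must be positive)
    means[m]     : mean vector mu of component m
    variances[m] : diagonal of the covariance Sigma of component m (assumption)
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log of the Gaussian normalising constant for a diagonal covariance.
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        # Log of the exponential term.
        log_exp = -0.5 * sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, var))
        log_terms.append(math.log(w) + log_norm + log_exp)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))
```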

6 Training an HHMM
  - EM for HHMM
    - Look for the maximum-likelihood state sequence and model parameters
  - E-step: best state sequence
    - Forward-backward algorithm
    - Viterbi algorithm
  - M-step: parameter estimation
    - Priors at both levels: π
    - State transition probabilities: α
    - Emission parameters: Gaussian mixture means μ and covariance matrices Σ
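The Viterbi step named on this slide can be sketched for a flat HMM as below. This is a generic log-domain version under the assumption that per-frame emission log-likelihoods have already been computed; it is not the hierarchical variant, and the function signature is illustrative.

```python
import math


def viterbi(log_prior, log_trans, log_emit):
    """Most likely state path for a flat HMM, computed in the log domain.

    log_prior[i]    : log prior of state i
    log_trans[i][j] : log transition probability from state i to state j
    log_emit[t][i]  : log-likelihood of the observation at time t under state i
    """
    n_states = len(log_prior)
    score = [log_prior[i] + log_emit[0][i] for i in range(n_states)]
    back = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor of state j at time t.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emit[t][j])
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    best_log_prob = score[state]
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_log_prob
```

Using the decoded path to collect hard counts for re-estimation gives a Viterbi-style approximation to EM; the forward-backward algorithm would supply soft state posteriors instead.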

7 Preprocessing
  - Beat detection
    - Segment the music into beat-length frames
  - Feature extraction
    - Repetition-related feature (chorus/nonchorus): chroma vector
    - Intensity-related feature (vocal/nonvocal): subband-based Log Frequency Power Coefficients
    - Pitch-related features: narrowband spectrogram features (Hann-windowed FFT coefficients)
    - And possibly more... under investigation
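To make the chroma feature concrete, here is a dependency-free sketch that Hann-windows one frame, takes a direct DFT over a limited frequency range, and folds the power into 12 pitch classes. The frequency limits, the MIDI-based pitch-class mapping (0 = C, 9 = A), and the function name are assumptions for illustration; a real pipeline would use an FFT library and frames aligned to detected beats.

```python
import cmath
import math


def chroma_vector(frame, sample_rate, f_min=55.0, f_max=2000.0):
    """12-bin chroma vector of one audio frame (pitch class 0 = C, 9 = A)."""
    n = len(frame)
    # Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    windowed = [x * w for x, w in zip(frame, window)]
    chroma = [0.0] * 12
    k_lo = max(1, math.ceil(f_min * n / sample_rate))
    k_hi = min(n // 2, math.floor(f_max * n / sample_rate))
    for k in range(k_lo, k_hi + 1):
        # Direct DFT coefficient for bin k (slow, but needs no FFT library).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        freq = k * sample_rate / n
        # Nearest MIDI note number; A4 = 440 Hz = MIDI note 69.
        midi = round(69 + 12 * math.log2(freq / 440.0))
        chroma[midi % 12] += abs(coeff) ** 2
    total = sum(chroma)
    # Normalise to unit sum so frames of different loudness are comparable.
    return [c / total for c in chroma] if total > 0 else chroma
```

Because octave information is folded away, repeated sections such as choruses tend to produce similar chroma sequences even when the instrumentation changes.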

8 Tasks
  - HHMM on a test song
    - Songs with I-V1-C1-V2-C2-(V3-C3)-B-O structure
    - Manually label structures as ground truth
    - Predefine the number of states at both structure and frame levels
    - Preprocessing
    - Model fitting
  - Evaluation
    - Accuracy of structure identification
    - Accuracy of structure timing
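The two evaluation criteria could be measured along the following lines. The slides do not define the metrics, so both are assumptions: identification accuracy is taken as the fraction of beat frames with the correct label, and timing accuracy as the mean distance, in beats, from each true section boundary to the nearest predicted one.

```python
def frame_accuracy(truth, predicted):
    """Fraction of beat frames whose predicted structure label is correct."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    matches = sum(t == p for t, p in zip(truth, predicted))
    return matches / len(truth)


def boundaries(labels):
    """Indices of frames where the structure label changes."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


def mean_boundary_error(truth, predicted):
    """Mean distance (in beats) from each true boundary to the nearest predicted one."""
    true_b, pred_b = boundaries(truth), boundaries(predicted)
    if not true_b or not pred_b:
        return None  # undefined when either sequence has no boundaries
    return sum(min(abs(b - p) for p in pred_b) for b in true_b) / len(true_b)
```

Labels here are one symbol per beat frame, for example `list("IIVVVCCCOO")` for a manually annotated song.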

9 Reference
  - Y. Wang, M.-Y. Kan, T. L. Nwe, A. Shenoy, J. Yin, “LyricAlly: Automatic Synchronization of Acoustical Musical Signals and Textual Lyrics”, ACM MM 2004
  - C. Raphael, “A Hybrid Graphical Model For Aligning Polyphonic Audio With Musical Scores”, ISMIR 2004
  - C. Raphael, “Automatic Segmentation of Acoustic Musical Signals Using Hidden Markov Models”, IEEE Trans on PAMI, 1999
  - P. J. Walmsley, S. J. Godsill, P. J. W. Rayner, “Polyphonic Pitch Tracking Using Joint Bayesian Estimation of Multiple Frame Parameters”, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1999
  - L. Xie, S.-F. Chang, A. Divakaran, H. Sun, “Learning Hierarchical Hidden Markov Models for Video Structure Discovery”, Tech Report, Columbia Univ, 2002
  - L. Xie, S.-F. Chang, A. Divakaran, H. Sun, “Unsupervised Mining of Statistical Temporal Structures in Video”, Video Mining, Ch 10, Kluwer Academic Publishers, 2003
  - R. J. Turetsky, D. P. W. Ellis, “Ground-Truth Transcriptions of Real Music from Force-Aligned MIDI Synthesis”, ISMIR 2003

