MUMT611: Music Information Acquisition, Preservation, and Retrieval Presentation on Timbre Similarity Alexandre Savard March 2006.


1 MUMT611: Music Information Acquisition, Preservation, and Retrieval
Presentation on Timbre Similarity
Alexandre Savard, March 2006

2 Content
– Introduction
– Measurement of timbre
– Measurement of similarity
– Systems evaluation
– Recent developments
– Conclusion

3 Introduction
Incomplete timbre definition
– Timbre is a fundamental dimension of sound.
– Timbre has too often been described as the dimension of sound that lets the listener distinguish between two sounds that have the same pitch and the same loudness.

4 Introduction
Incomplete timbre definition
– An efficient operational definition of timbre has not yet been achieved.
– Previous research has demonstrated the multidimensional nature of timbre.
– Existing timbre research has already compared the timbre similarity of single instrumental notes.

5 Introduction
Physical features of timbre
– Attack transients
– Spectral flux
– Spectral centre of gravity (spectral centroid)
– Harmonicity ratio
– Spectral/temporal envelope
– Other factors: pitch, loudness
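
Two of the features listed above can be illustrated with a minimal sketch. It assumes one common definition of each: the spectral centre of gravity as the magnitude-weighted mean frequency, and spectral flux as the Euclidean distance between successive magnitude spectra. The function names and the 4-bin example spectra are hypothetical; the slides do not specify a formula.

```python
import math

def spectral_centroid(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of one spectrum frame, in Hz."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: return 0 by convention
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total

def spectral_flux(previous, current):
    """Euclidean distance between two successive magnitude spectra."""
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(previous, current)))

# Hypothetical 4-bin spectra; real frames would come from an FFT.
freqs_hz = [0.0, 1000.0, 2000.0, 3000.0]
frame_a = [0.0, 1.0, 1.0, 0.0]
frame_b = [0.0, 0.0, 1.0, 1.0]

centroid_a = spectral_centroid(freqs_hz, frame_a)  # energy centred on 1500 Hz
centroid_b = spectral_centroid(freqs_hz, frame_b)  # energy centred on 2500 Hz
flux_ab = spectral_flux(frame_a, frame_b)          # two bins each change by 1
```

A brighter sound moves the centroid upward, while a rapidly changing spectrum yields a larger flux between frames.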

6 Introduction
Global timbre
– A local definition of timbre appears to be of little use for electronic music distribution or music recommendation systems.
– Researchers use the concept of “global” timbre, which attributes a single timbre quality to an entire piece.
– This idea only makes sense if there is little variation in texture and instrumentation.

7 Measurement of timbre
Mel-Frequency Cepstrum Coefficient
– Mel-Frequency Cepstrum Coefficient (MFCC)
– Spectral gravity centre
– Spectral envelope
– Spectral flux
– These measures are combined in a “feature vector”.

8 Measurement of timbre
Mel-Frequency Cepstrum Coefficient
– It is a measure of the variations of the spectral envelope.
– Its computation involves mapping linear frequencies to the psychoacoustically based Mel scale.
– The result is an ordered sequence of coefficients.
– Low-order coefficients describe slow variations of the spectral envelope.
– High-order coefficients describe fast variations.
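
The two ingredients named above, the Mel mapping and the ordered coefficient sequence, can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: the Mel formula is one widely used variant (2595 log10(1 + f/700)), the coefficients are an unnormalised DCT-II of log band energies, and the filter-bank step that would produce those energies is omitted. All function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Map a linear frequency in Hz to the Mel scale (one common variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def cepstral_coefficients(log_band_energies, n_coeffs):
    """Unnormalised DCT-II of log Mel-band energies, first n_coeffs terms."""
    n = len(log_band_energies)
    return [
        sum(e * math.cos(math.pi * k * (i + 0.5) / n)
            for i, e in enumerate(log_band_energies))
        for k in range(n_coeffs)
    ]

# A perfectly flat (hypothetical) envelope has no variation across bands,
# so everything above the zeroth coefficient vanishes.
flat = cepstral_coefficients([1.0, 1.0, 1.0, 1.0], 3)

# An envelope that tilts steadily across the bands puts energy in the
# first-order coefficient, which captures the slowest variation.
tilted = cepstral_coefficients([4.0, 3.0, 2.0, 1.0], 3)
```

Keeping only the first few coefficients, as in the 8-coefficient setup described later in the presentation, retains the smooth shape of the envelope and discards fine detail.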

9 Measurement of Similarity
Similarity metric
– Metrics are applied to calculate the distance between two representations and so determine the similarity of the music.
– The metric should be related to the strategy humans use in similarity judgments of timbre.

10 Measurement of Similarity
Gaussian Mixture Model
– MFCC extraction produces a large number of coefficients.
– A more compact representation is needed to handle those results.

11 Measurement of Similarity
Gaussian Mixture Model
– A GMM is composed of one or more Gaussian probability distributions (components).
– The distance between GMMs can be seen as a measurement of similarity.
– Probabilities are computed from random samples drawn for each song to be compared.
– Samples are taken from both songs to be compared.

12 Measurement of Similarity
Gaussian Mixture Model
– The “distance” between GMMs can be seen as a measurement of similarity.
– “Distance” is the amount of change needed to obtain samples of the second song from the first one.
– The higher the probability of one song's samples under the other song's model, the higher the similarity.
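
The sampling idea on the GMM slides can be sketched as below. The slides do not give an exact formula, so the symmetrised log-likelihood expression used here is an assumption, as are the function names. For brevity the models are one-dimensional (weight, mean, standard deviation) triples standing in for GMMs over MFCC vectors.

```python
import math
import random

def gmm_sample(gmm, n, rng):
    """Draw n values from a 1-D GMM given as (weight, mean, std) triples."""
    weights = [w for w, _, _ in gmm]
    samples = []
    for _ in range(n):
        _, mean, std = rng.choices(gmm, weights=weights, k=1)[0]
        samples.append(rng.gauss(mean, std))
    return samples

def gmm_log_likelihood(gmm, samples):
    """Sum of log densities of the samples under the GMM (log-sum-exp form)."""
    total = 0.0
    for x in samples:
        logs = [math.log(w) - math.log(s) - 0.5 * math.log(2.0 * math.pi)
                - 0.5 * ((x - m) / s) ** 2 for w, m, s in gmm]
        peak = max(logs)
        total += peak + math.log(sum(math.exp(v - peak) for v in logs))
    return total

def sampling_distance(gmm_a, gmm_b, n_samples=100, seed=0):
    """Assumed form: how much less likely each song's samples are under the
    other song's model than under its own. Larger means less similar."""
    rng = random.Random(seed)
    s_a = gmm_sample(gmm_a, n_samples, rng)
    s_b = gmm_sample(gmm_b, n_samples, rng)
    return (gmm_log_likelihood(gmm_a, s_a) + gmm_log_likelihood(gmm_b, s_b)
            - gmm_log_likelihood(gmm_b, s_a) - gmm_log_likelihood(gmm_a, s_b))

# Hypothetical song models.
song_a = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
song_b = [(0.5, 0.5, 1.0), (0.5, 4.5, 1.0)]   # close to song_a
song_c = [(1.0, 20.0, 1.0)]                   # far from song_a

d_self = sampling_distance(song_a, song_a)
d_near = sampling_distance(song_a, song_b)
d_far = sampling_distance(song_a, song_c)
```

Because the result is a Monte Carlo estimate, it varies with the seed and the number of samples; the default of 100 samples mirrors the figure quoted for Aucouturier and Pachet (2002) later in the presentation.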

13 Measurement of Similarity
Gaussian Mixture Model
J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.

14 Measurement of Similarity
Different approaches
– Neural Networks
– Hidden Markov Models
– Gaussian Mixture Models
– Self-Organizing Maps

15 Systems Evaluation
Evaluation criteria
– Timbre similarity judgment is based on a set of objective and subjective perceptual, cognitive, and cultural aspects.
– Measures are highly dependent on the music present in the database.

16 Systems Evaluation
Objective evaluation
– The objective evaluation of a timbral similarity measure is problematic.
– The metadata of a given database include descriptions of the artist and the genre. However, timbre quality is usually not described in them.

17 Systems Evaluation
Subjective evaluation
– Conducting a psychoacoustical survey.
– Deciding whether two songs have similar timbre can be uncertain, as timbre similarity is an ill-defined concept.

18 Recent Developments
Aucouturier and Pachet (2002)
– Segmented each song using fixed 50 ms windows.
– Used 8 MFCCs to characterize each segment.
– Used a Gaussian Mixture Model composed of three Gaussian probability distributions.
– Took 100 random samples for the similarity measurement.
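
The first step of that setup, cutting a signal into fixed 50 ms windows, can be sketched as follows. The slide does not say whether the windows overlap, so non-overlapping frames are assumed here. The constant names, the function name, and the 44.1 kHz sample rate in the example are hypothetical.

```python
# Parameter values quoted on the slide (names are illustrative only).
WINDOW_MS = 50          # frame length in milliseconds
N_MFCC = 8              # coefficients kept per frame
N_GMM_COMPONENTS = 3    # Gaussians per song model
N_DISTANCE_SAMPLES = 100

def split_into_frames(signal, sample_rate, window_ms=WINDOW_MS):
    """Cut a signal into consecutive non-overlapping frames (an assumption);
    trailing samples that do not fill a whole frame are dropped."""
    frame_len = int(sample_rate * window_ms / 1000)
    if frame_len < 1:
        raise ValueError("window is shorter than one sample")
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

# One second of placeholder audio at an assumed 44.1 kHz sample rate.
one_second = [0.0] * 44100
frames = split_into_frames(one_second, 44100)
```

Each frame would then be reduced to N_MFCC coefficients, and the per-frame vectors of a whole song fitted with a GMM of N_GMM_COMPONENTS components.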

19 Recent Developments
Aucouturier and Pachet (2002)
J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.

20 Recent Developments
Aucouturier and Pachet (2004)
– Finding the best set of parameters:
  – Sampling rate of the music signal
  – Number of MFCCs extracted from each frame of data
  – Number of components used in the GMM
  – Distance sample rate used to estimate the likelihood of one model given another
  – Window size

21 Recent Developments
Aucouturier and Pachet (2004)
J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.

22 Recent Developments
Aucouturier and Pachet (2004)
– Tried alternative similarity measurements using Earth Mover’s Distance and Hidden Markov Models.
– Those techniques did not improve performance.
– This suggests that there could be a ceiling on the performance of techniques based on timbre similarity.

23 Recent Developments
Liu and Huang (2002)
– Developed an algorithm for singing voice.
– Used MFCCs as well as a GMM for their timbre representation.
– The audio signal is segmented according to the phonemes in the singing.

24 Recent Developments
Logan and Salomon (2001)
– Characterized timbre with MFCCs.
– Used K-means clustering instead of a GMM.
– Calculated the degree of similarity using Earth Mover’s Distance.
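
The K-means plus Earth Mover’s Distance pipeline can be illustrated with a deliberately simplified sketch. It is not Logan and Salomon's implementation: the data are one-dimensional stand-ins for MFCC vectors, and in one dimension the Earth Mover’s Distance between two signatures of equal total mass reduces to the area between their cumulative distributions, whereas the general case requires solving a transportation problem. All names and values are hypothetical.

```python
def kmeans_1d(values, k, n_iter=20):
    """Plain Lloyd's algorithm in 1-D. Returns a signature: a list of
    (cluster centre, fraction of points in the cluster) pairs."""
    ordered = sorted(values)
    # Deterministic start: centres spread evenly through the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            clusters[nearest].append(v)
        # Empty clusters keep their previous centre.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return [(centres[i], len(clusters[i]) / len(values)) for i in range(k)]

def emd_1d(sig_a, sig_b):
    """1-D Earth Mover's Distance between two signatures whose weights
    each sum to 1: the area between the two cumulative distributions."""
    events = sorted([(p, w) for p, w in sig_a] + [(p, -w) for p, w in sig_b])
    work, carried, prev = 0.0, 0.0, events[0][0]
    for pos, mass in events:
        work += abs(carried) * (pos - prev)  # mass in transit times distance
        carried += mass
        prev = pos
    return work

# Hypothetical per-frame feature values for two songs.
song_a = [0.0, 0.2, 0.1, 10.0, 10.2, 10.1]
song_b = [1.0, 1.2, 1.1, 10.0, 10.2, 10.1]

sig_a = kmeans_1d(song_a, 2)
sig_b = kmeans_1d(song_b, 2)
distance = emd_1d(sig_a, sig_b)  # half the mass moves by about 1.0
```

Here the two songs share one cluster and differ in the other, so only half of the total mass has to be moved.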

25 Conclusion

26 Bibliography
J. Aucouturier, F. Pachet, and M. Sandler. 2004. “The way it sounds”: Timbre models for analysis and retrieval of music signals. IEEE Transactions on Multimedia.
J. Aucouturier and F. Pachet. 2004. Improving timbre similarity: How high’s the sky? Proceedings of the International Conference on Music Information Retrieval.
J. Aucouturier and F. Pachet. 2002. Music similarity measures: What’s the use? Proceedings of the International Conference on Music Information Retrieval.
C. Liu and C. Huang. 2002. A singer identification technique for content-based classification of mp3 music object. Proceedings of the Conference on Information and Knowledge Management.
B. Logan and A. Salomon. 2001. A music similarity function based on signal analysis. Proceedings of the International Conference on Multimedia and Expo.

