Automatic Classification of Audio Data, by Carlos H. L. Costa, Jaime D. Valle, Ro L. Koerich. IEEE International Conference on Systems, Man, and Cybernetics.

1 Automatic Classification of Audio Data, by Carlos H. L. Costa, Jaime D. Valle, Ro L. Koerich. IEEE International Conference on Systems, Man, and Cybernetics (SMC2004), pp. 562-567, The Hague, The Netherlands, October 2004. Speaker: 李孟倫. Date: 2008/10/7.

2 Outline: Introduction; System Overview; Feature Extraction; Classification; Experimental Results; Concluding Remarks.

4 Introduction: Few works have dealt with musical genre classification. Most of them have focused on relatively few classes of very distinct musical genres, have used non-parametric classification strategies, and have dealt with small databases. Application: to assist or replace the human user in this process, and to provide an important component of a complete music information retrieval system for audio signals.

6 System Overview: The classification system is composed of three main stages: feature extraction, classification, and combination & decision.

7 System Overview: [Figure: an overview of the proposed musical genre classification approach]

8 System Overview, Feature Extraction: Features are extracted from three selected regions of the music clip. From each region, a 15-dimensional feature vector is generated.

9 System Overview, Classification: A music clip whose genre is unknown is submitted to the system. Three feature vectors are extracted from the corresponding regions of the clip and fed to the classifiers. Each classifier outputs a class and a confidence score.

10 Combination & Decision: The outputs of the classifiers are combined through a majority voting rule to decide the final class assigned to the input music clip.

12 Feature Extraction: Two different types of features: musical surface features and beat-related features.

13 Feature Extraction: The features presented are based on the short-time Fourier transform (STFT) and are calculated for every short-time frame of sound. Musical surface features: spectral centroid, spectral rolloff.

14 Feature Extraction: Musical surface features (continued): spectral flux, time-domain zero-crossings, low-energy.
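The slides name the five musical surface features without defining them. The sketch below uses commonly cited definitions, which are an assumption here and not taken from the slides: centroid as the magnitude-weighted mean bin, rolloff at 85% of the total magnitude, flux on sum-normalised spectra, and low-energy as the fraction of frames below the mean RMS. All function names are hypothetical, and the naive DFT is only suitable for short frames.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of the first N/2 DFT bins of one frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]

def spectral_centroid(mag):
    """Magnitude-weighted mean bin index (0.0 for a silent frame)."""
    total = sum(mag)
    return sum(k * m for k, m in enumerate(mag)) / total if total else 0.0

def spectral_rolloff(mag, fraction=0.85):
    """Smallest bin index at which the running magnitude reaches `fraction` of the total."""
    threshold = fraction * sum(mag)
    running = 0.0
    for k, m in enumerate(mag):
        running += m
        if running >= threshold:
            return k
    return len(mag) - 1

def spectral_flux(mag, prev_mag):
    """Squared difference between two successive sum-normalised spectra."""
    def normalise(v):
        s = sum(v)
        return [x / s for x in v] if s else list(v)
    return sum((a - b) ** 2 for a, b in zip(normalise(mag), normalise(prev_mag)))

def zero_crossings(frame):
    """Number of sign changes between consecutive time-domain samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def low_energy(frames):
    """Fraction of frames whose RMS energy is below the mean RMS over all frames."""
    rms = [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames]
    mean_rms = sum(rms) / len(rms)
    return sum(1 for r in rms if r < mean_rms) / len(rms)
```

For a frame holding a pure sinusoid, the centroid and the rolloff both land on the sinusoid's own frequency bin, which is a quick sanity check for the two spectral measures.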

15 Feature Extraction: Beat-related features: relative amplitudes of the first two peaks in the beat histogram (2 parameters); the ratio of the amplitude of the second peak to the amplitude of the first peak; periods of the first and second peaks in beats per minute (2 parameters); overall sum of the beat histogram.
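The six beat-related values can be read off a beat histogram once it exists. The sketch below assumes the histogram is already computed and is given as a mapping from tempo (BPM) to amplitude. It also assumes that "relative amplitude" means the peak amplitude divided by the histogram sum, and that the two peaks are simply the two largest bins. None of these details are stated on the slide, and `beat_features` is a hypothetical name.

```python
def beat_features(histogram):
    """Six rhythm features from a beat histogram.

    histogram: dict mapping tempo in BPM -> accumulated amplitude
    (assumed to hold at least two bins with positive amplitudes).
    """
    total = sum(histogram.values())
    # Assumption: the "peaks" are the two bins with the largest amplitude.
    (bpm1, amp1), (bpm2, amp2) = sorted(
        histogram.items(), key=lambda item: item[1], reverse=True
    )[:2]
    return [
        amp1 / total,  # relative amplitude of the first peak
        amp2 / total,  # relative amplitude of the second peak
        amp2 / amp1,   # ratio of second-peak to first-peak amplitude
        bpm1,          # period of the first peak, in BPM
        bpm2,          # period of the second peak, in BPM
        total,         # overall sum of the beat histogram
    ]
```

A real peak picker would look for local maxima and suppress neighbouring bins; taking the two largest bins keeps the example short.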

16 Feature Extraction: The 15-dimensional feature vector is composed of nine musical surface features (means and variances of spectral centroid, rolloff, flux, and zero-crossings, plus low-energy) and six features related to the music rhythm.
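The count works out as 4 features x (mean, variance) + 1 low-energy value + 6 rhythm values = 15. A minimal sketch of the assembly, assuming per-frame feature series are already available as lists; the ordering of the components and the use of the population variance are assumptions, and `feature_vector` is a hypothetical name.

```python
from statistics import fmean, pvariance

def feature_vector(centroids, rolloffs, fluxes, crossings, low_energy, beat):
    """Build the 15-dimensional vector for one region of a clip."""
    if len(beat) != 6:
        raise ValueError("expected six beat-related features")
    vector = []
    for series in (centroids, rolloffs, fluxes, crossings):
        vector.append(fmean(series))      # mean over the frames of the region
        vector.append(pvariance(series))  # population variance over the frames
    vector.append(low_energy)             # ninth musical surface feature
    vector.extend(beat)                   # six rhythm features
    return vector
```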

18 Classification: instance-based classification; neural network classifier; combining classifiers and decision.

19 Classification, Instance-Based Classification: The k-nearest neighbor algorithm assumes that all instances correspond to points in the n-dimensional space $\mathbb{R}^n$. The nearest neighbors of an instance are defined in terms of the standard Euclidean distance. The distance between two vectors $x_i$ and $x_j$ is denoted $d(x_i, x_j)$, where $d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} (x_{i,r} - x_{j,r})^2}$ and $x_i, x_j \in \mathbb{R}^n$.
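A minimal k-nearest-neighbor sketch using the Euclidean distance described on this slide. The value of k is not given in the transcript, so `k=3` is an assumption, as are the function names and the toy training data in the test.

```python
import math
from collections import Counter

def euclidean(x_i, x_j):
    """Standard Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def knn_classify(query, training, k=3):
    """Label `query` by the most common label among its k nearest neighbors.

    training: list of (feature_vector, label) pairs.
    """
    neighbours = sorted(training, key=lambda pair: euclidean(query, pair[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

With two classes, an odd k avoids tied votes among the neighbors.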

20 Classification, Neural Network Classifier: Based on a multilayer perceptron (MLP) with one hidden layer.

21 Classification: [Figure: MLP with an input layer (15 neurons, X1 ... X15), a hidden layer (8 neurons), and an output layer (2 neurons, Y1 and Y2)] Hidden layer size: (input neurons + output neurons)/2 ~ 2 * input neurons.

22 Classification: [Figure: the same MLP (15 input neurons, 8 hidden neurons, 2 output neurons), with the error labeled] The network was trained using the backpropagation algorithm with momentum.
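The transcript names backpropagation with momentum but does not show the update rule. In its commonly written form, each weight change adds a fraction of the previous change to the current gradient step. The symbols are assumptions, not taken from the slides: $\eta$ is the learning rate, $\alpha$ the momentum coefficient, and $E$ the network error.

```latex
\Delta w_{ij}(t) = -\eta \, \frac{\partial E}{\partial w_{ij}} + \alpha \, \Delta w_{ij}(t-1),
\qquad
w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)
```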

23 Sigmoidal function
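The formula on this slide did not survive in the transcript. A sketch of the usual logistic sigmoid, 1 / (1 + e^(-z)), and of a forward pass through a 15-8-2 network like the one pictured is given here under assumptions: the random weight initialisation, the use of the sigmoid on both layers, and all function names are illustrative rather than the authors' configuration.

```python
import math
import random

def sigmoid(z):
    """Logistic sigmoid, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer followed by the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def make_layer(n_in, n_out, rng):
    """Small random weights and biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

def mlp_forward(x, hidden, output):
    """Forward pass: input -> hidden layer -> output layer."""
    return layer(layer(x, *hidden), *output)

rng = random.Random(0)
hidden = make_layer(15, 8, rng)   # 15 inputs -> 8 hidden neurons
output = make_layer(8, 2, rng)    # 8 hidden neurons -> 2 outputs
scores = mlp_forward([0.1] * 15, hidden, output)
predicted = max(range(2), key=lambda i: scores[i])  # index of the larger output
```

The larger of the two outputs gives the predicted class, and its value can serve as the confidence score.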

24 Combination & Decision: Majority voting scheme. Only the class provided by each classifier is considered; the confidence score associated with that class is neglected.
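The voting rule on this slide can be sketched in a few lines. The function name and the `(label, confidence)` pair layout are assumptions for illustration; the confidence is carried along only to show that it is ignored.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent class label.

    predictions: list of (label, confidence) pairs, one per classifier output.
    """
    labels = [label for label, _confidence in predictions]  # scores are ignored
    return Counter(labels).most_common(1)[0][0]
```

With three votes over two classes a tie cannot occur, so no tie-breaking rule is needed in that setting.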

26 Experimental Results: More than 1,000 music clips were available for the experiments. The genre was assigned according to the profile of the artist and the perceptual characteristics evaluated by human beings. The dataset used in the experiments is composed of 414 music clips (212 from the genre rock and 212 from the genre classic). Training set: 208 samples. Validation set: 82 samples. Test set: 144 samples.

27 Experimental Results (1): A single feature vector is extracted from the middle region of each music clip. The classifiers were trained using 208 feature vectors and tested using 122 feature vectors. A validation set with 82 feature vectors was used during the training of the MLP to monitor generalization and avoid overfitting.

28 Experimental Results (2): Three feature vectors are extracted from three different regions (beginning, middle, and end) of each music clip. The classifiers were trained using 624 feature vectors and tested using 366 feature vectors. A validation set with 246 feature vectors was also used during the training of the MLP to monitor generalization and avoid overfitting.

29 [Results for the MLP and k-NN classifiers; figure not included in the transcript]

30 Experimental Results (2): Majority vote rule.

32 Concluding Remarks: A slight improvement in the correct musical genre classification rate was achieved. Future work will include other combination strategies that take into account the confidence score provided by each classifier, as well as a rejection mechanism to further improve the reliability of the system.

