Centre for Computational Creativity Semantic Audio Studio Tools and Techniques using MPEG-7 Dr. Michael Casey Centre for Computational Creativity Department.

Presentation transcript:

1 Centre for Computational Creativity Semantic Audio Studio Tools and Techniques using MPEG-7 Dr. Michael Casey Centre for Computational Creativity Department of Computing City University, London

2 Overview
  MPEG-7 Tools
    Low Level Audio Descriptors
    Statistical Sound Models (Semantic ?)
  Music Unmixing
    Independent Spectrogram Separation
  Sound Classification
    Automatic label extraction
    “Semantic” processing
  Segment Similarity, Structure Extraction
  Musaics
    S-Matrix (Self-Similarity Matrix)
    C-Matrix (Cross-Similarity Matrix)
    Segment Replacement Musaics

3 Semantic Audio Analysis Acoustic Features Extraction Semantic Audio Description

4 MPEG-7 Audio Descriptors Header

5 MPEG-7 Audio Descriptors Segments

6 MPEG-7 Audio Descriptors Descriptor

7 Some Useful Descriptors for Music Processing
  AudioSpectrumEnvelopeD
  AudioSpectrumBasisD
  AudioSpectrumProjectionD
  SoundModelDS
  SoundModelStatePathD
  SoundModelStateHistogramD

8 EXAMPLE 1 MUSIC UNMIXING

9 AudioSpectrumBasisD

10 AudioSpectrumBasisD SVD / ICA Basis Rotation AudioSpectrumProjectionD AudioSpectrumBasisD

11 AudioSpectrumBasisD

12 AudioSpectrumProjectionD SVD / ICA Basis Rotation AudioSpectrumProjectionD AudioSpectrumBasisD

13 AudioSpectrumProjectionD

14 Outer Product Spectrum Reconstruction Individual Basis Component

15 4 Component Reconstruction

16 10 Component Reconstruction
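
The outer-product reconstruction on slides 14 to 16 can be sketched as follows. This is a minimal illustration, assuming a plain SVD stands in for the SVD / ICA basis rotation and that a random matrix stands in for a real spectrogram; the function name `reconstruct` and the matrix sizes are hypothetical, not taken from the slides.

```python
import numpy as np

def reconstruct(X, k):
    """Sum the first k outer-product layers of the SVD of X (frames x bins)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # One layer = one time function (column of U, scaled by its singular
    # value) times one spectral basis function (row of Vt).
    layers = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(k)]
    return np.sum(layers, axis=0)

rng = np.random.default_rng(0)
X = np.abs(rng.standard_normal((200, 32)))  # stand-in magnitude spectrogram

# Reconstruction error for 1, 4, 10 and all 32 components.
errors = [np.linalg.norm(X - reconstruct(X, k)) for k in (1, 4, 10, 32)]
```

Each added component can only lower the error, so the 10 component reconstruction is at least as close to the mixture as the 4 component one.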

17 Music Unmixing
  Linear basis projection using SVD and ICA
    spectrum subspace separation
    fast computation of subspace ICA
    full-rate filterbank masking
  Blocked ICA functions
    subspace reconstruction $Y_j = X V_j V_j^{+}$
    cluster subspaces to identify “tracks”
    sum masked filterbank output to create audio
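
A minimal sketch of the subspace reconstruction step, reading the slide's formula as Y_j = X V_j V_j^+ with ^+ the pseudo-inverse. The basis here comes from an SVD only (no ICA rotation is applied), and the grouping of basis functions into "tracks" is a made-up example rather than the output of a clustering step; `subspace_layer` and `groups` are hypothetical names.

```python
import numpy as np

def subspace_layer(X, V, idx):
    """Return Y_j = X V_j V_j^+ for the basis columns listed in idx."""
    Vj = V[:, idx]                      # bins x len(idx)
    return X @ Vj @ np.linalg.pinv(Vj)  # frames x bins

rng = np.random.default_rng(1)
X = np.abs(rng.standard_normal((100, 16)))   # stand-in spectrogram
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T[:, :6]                   # keep 6 basis functions (orthonormal here)

groups = [[0, 1], [2, 3, 4], [5]]  # hypothetical clustering into "tracks"
layers = [subspace_layer(X, V, g) for g in groups]
full = subspace_layer(X, V, list(range(6)))
```

With an orthonormal basis, the layers of disjoint groups add up to the full 6 component reconstruction. After an ICA rotation the basis is generally not orthonormal, which is one reason to use the pseudo-inverse rather than a transpose.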

18 Subspace Extraction [Figure: Mixture Spectrogram and Independent Spectrogram Subspace Layers; each Spectrogram Layer is shown with its Spectral Basis and Time Function; panels: 1 Component, 4 Components, 10 Components]

19 Music Unmixing Example (Pink Floyd: mono -> 9 subspace tracks)

20 EXAMPLE 2 AUTOMATIC AUDIO CLASSIFICATION

21 Sound Model DS and related descriptors [Diagram labels: AudioSpectrumEnvelopeD, AudioSpectrumBasisD, AudioSpectrumProjectionD, ContinuousHiddenMarkovModelDS with states 1 2 3 4 and transitions T(i,j), SoundModelStatePathD with example path 1 3 3 2 2 3 4 4 4 4 ...]

22 Sound Recognition using HMMs Trained HMMs Sound Database
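
Recognition with trained HMMs is commonly done by scoring the input against each model and picking the most likely one. The sketch below assumes that scheme; the slide does not spell it out. It uses 1-D Gaussian emissions with a shared variance to stay short, whereas the descriptors above are multi-dimensional, and the two models ("low", "high") and all parameter values are invented for illustration.

```python
import numpy as np

def log_likelihood(obs, pi, T, means, var=1.0):
    """Forward algorithm in the log domain for an HMM with 1-D Gaussian emissions."""
    # log_b[t, i] = log N(obs[t]; means[i], var)
    log_b = -0.5 * ((obs[:, None] - means[None, :]) ** 2 / var
                    + np.log(2 * np.pi * var))
    alpha = np.log(pi) + log_b[0]
    for t in range(1, len(obs)):
        # m[i, j] = alpha[i] + log T[i, j]; log-sum-exp over previous state i
        m = alpha[:, None] + np.log(T)
        alpha = np.logaddexp.reduce(m, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)

pi = np.array([0.5, 0.5])
T = np.array([[0.9, 0.1], [0.1, 0.9]])
# Hypothetical trained models: one set of state means per sound class.
models = {"low": np.array([0.0, 2.0]), "high": np.array([5.0, 7.0])}

obs = np.array([4.9, 5.2, 7.1, 6.8, 5.0])
scores = {name: log_likelihood(obs, pi, T, means) for name, means in models.items()}
label = max(scores, key=scores.get)   # class whose HMM scores highest
```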

23 MPEG-7: Intelligent Music Browsing

24 Music Genre Classification:
  Class Name       Num of Files   Num Segments
   1) Blues              79             86
   2) hiphop             15            129
   3) Gospel             23             25
   4) Country            27             28
   5) DrumNBass          26            275
   6) Classical           8            156
   7) 2Step              39            311
   8) Merengue           34            304
   9) Reggae             80            398
  10) Salsa              39            425
  -------------------------------------------
  Totals                370           2137

25 Music Genre Classification

26 Semantic Audio: General Sound Taxonomy

27 DS: General Audio Classification

28 EXAMPLE 3 STRUCTURE EXTRACTION

29 Structure Discovery Acoustic Features State-Space Models Hierarchical Structure Discovery

30 SoundModelStatePathD State Path A simplified representation of spectral dynamics
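
A state path of this kind is typically obtained by Viterbi decoding of the HMM. The sketch below assumes that approach and is not taken from the slides. It uses 1-D observations, three states with made-up means, and 0-based state indices, all of which are illustrative choices.

```python
import numpy as np

def viterbi(log_b, log_pi, log_T):
    """Most likely state sequence given per-frame log emission scores log_b (frames x states)."""
    n, k = log_b.shape
    delta = log_pi + log_b[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        scores = delta[:, None] + log_T          # scores[i, j]: from state i to state j
        back[t] = np.argmax(scores, axis=0)      # best predecessor of each state j
        delta = scores[back[t], np.arange(k)] + log_b[t]
    path = [int(np.argmax(delta))]
    for t in range(n - 1, 0, -1):                # trace the back-pointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]

means = np.array([0.0, 5.0, 10.0])               # hypothetical state means
obs = np.array([0.1, 0.2, 4.8, 5.1, 9.9, 10.2, 5.0])
log_b = -0.5 * (obs[:, None] - means[None, :]) ** 2   # unit-variance Gaussian, constant dropped
T = np.full((3, 3), 0.1) + 0.7 * np.eye(3)       # "sticky" transitions: 0.8 stay, 0.1 move
path = viterbi(log_b, np.log(np.full(3, 1 / 3)), np.log(T))
```

The resulting list of integers, one per frame, is the kind of compact sequence the slide calls a simplified representation of spectral dynamics.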

31 SoundModelStateHistogramD [Figure: state index against time in seconds, 0.01s frames]
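
A state histogram can be computed from a state path by counting how often each state occurs in a segment. The sketch below assumes the histogram is normalised to relative frequencies; it uses the example path from slide 21 (1 3 3 2 2 3 4 4 4 4), shifted to 0-based indices. The function name is hypothetical.

```python
import numpy as np

def state_histogram(path, n_states):
    """Relative frequency of each state index in one segment's state path."""
    counts = np.bincount(path, minlength=n_states)
    return counts / counts.sum()

# Slide 21 example path, shifted from states 1..4 to indices 0..3.
path = np.array([0, 2, 2, 1, 1, 2, 3, 3, 3, 3])
hist = state_histogram(path, 4)
```

The histogram discards frame order within the segment, so segments of different lengths can be compared directly.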

32 High-Level Structure Discovery

33 S-Matrix
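
A self-similarity matrix compares every segment of a song with every other segment of the same song. The sketch below assumes each segment is summarised by a state histogram and that cosine similarity is the measure; the slides do not state which measure was used. The histogram values and the "verse" / "chorus" labels are invented.

```python
import numpy as np

def s_matrix(H):
    """Cosine similarity between every pair of rows of H (segments x states)."""
    unit = H / np.linalg.norm(H, axis=1, keepdims=True)
    return unit @ unit.T

H = np.array([[0.7, 0.2, 0.1, 0.0],   # hypothetical "verse"
              [0.0, 0.1, 0.2, 0.7],   # hypothetical "chorus"
              [0.7, 0.2, 0.1, 0.0],   # "verse" again
              [0.0, 0.1, 0.3, 0.6]])  # a slightly different "chorus"
S = s_matrix(H)
```

Repeated sections show up as high off-diagonal entries (here `S[0, 2]` and `S[1, 3]`), which is what makes the matrix useful for structure extraction.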

34 STRUCTURE EXTRACTION == SEGMENTATION

35 Structure Discovery Low level features High-level Structure Acoustic Features State-Space Models Hierarchical Structure Discovery

36 Alanis Morissette Human Segmentation Machine Segmentation High-Level Structure Discovery

37 Cranberries Human Segmentation Machine Segmentation High-Level Structure Discovery

38 Nirvana Human Segmentation Machine Segmentation High-Level Structure Discovery

39 High-Level Structure Discovery

40 EXAMPLE 4 MUSAICS

41 Musaics (Music Mosaics)
  C-Matrix: Cross-Song Similarity Matrix
    Outer product of target and source histograms
  Find segments similar to target segment
    Similarity between all target and database segments
    SORT columns of similarity matrix
  Replace segments with similar material
    Segmentation boundaries (beat alignment)
    Replace with “best fit” using DTW on most similar segments
  EXAMPLES
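
The C-Matrix and column sort can be sketched as below. This assumes the "outer product of target and source histograms" means the matrix of dot products between every source histogram and every target histogram, with one column per target segment. The function names and histogram values are hypothetical.

```python
import numpy as np

def c_matrix(target_H, source_H):
    """Similarity of every source segment (rows) to every target segment (columns)."""
    return source_H @ target_H.T

def best_matches(C):
    """Sort each column so the most similar source segments come first."""
    return np.argsort(-C, axis=0)

target_H = np.array([[0.8, 0.2, 0.0],     # target segment 0
                     [0.0, 0.2, 0.8]])    # target segment 1
source_H = np.array([[0.0, 0.1, 0.9],     # database segment 0
                     [0.9, 0.1, 0.0],     # database segment 1
                     [0.4, 0.3, 0.3]])    # database segment 2

C = c_matrix(target_H, source_H)
ranking = best_matches(C)
replacement = ranking[0]   # best database segment for each target segment
```

A raw dot product favours sharply peaked histograms; normalising the rows first (as in cosine similarity) is one possible variation.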

42 Musaics [Pipeline diagram; labels: Target, Database, Extract MPEG-7 StatePathHistograms, Segment Beats, Match, Replace, Musaic]

43 Musaics

44 Musaics

45 Musaics

46 Musaics

47 Musaics

48 Musaics

49 Musaics

50 Musaics

51 Musaics

52 Musaics

53 Musaics

54 Musaics
  New Content by Similarity Replacement
    C-Matrix: Cross-Song Similarity Map
    1 Target, Many Sources
  Constraints
    Preserve Rhythm by Beat Tracking
    Preserve Beats by DTW alignment
  Bigger Source Database == Better
    Greater Number of Accurate Matches
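
The DTW alignment mentioned above can be illustrated with the textbook dynamic time warping recurrence. This is a generic sketch, not the presenter's implementation: sequences are 1-D feature values, the local cost is an absolute difference, and the candidate names are invented.

```python
import numpy as np

def dtw_cost(a, b):
    """Cumulative cost of the best monotonic alignment between sequences a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

target = [0, 1, 2, 1, 0]
candidates = {
    "stretched": [0, 0, 1, 2, 2, 1, 0],   # same shape, different timing
    "other": [2, 2, 0, 0, 2],
}
costs = {name: dtw_cost(target, seq) for name, seq in candidates.items()}
best = min(costs, key=costs.get)          # the "best fit" replacement
```

Because DTW tolerates local timing differences, a candidate with the same contour but a different length can still align at zero cost.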

55 Acknowledgements
  International Standards Organisation ISO/IEC JTC 1 SC29 WG11 (MPEG)
  Mitsubishi Electric Research Labs
  Massachusetts Institute of Technology Music Mind Machine Group (formerly Machine Listening Group)
  Paris Smaragdis, Youngmoo Kim, Brian Whitman, Iroro Orife, John Hershey, Alex Westner, Kevin Wilson
  City University Department of Computing, Centre for Computational Creativity

