A Musical Data Mining Primer CS235 – Spring ’03 Dan Berger
Outline Motivation/Problem Overview Background Types of Music Digital Representations Psychoacoustics Query (Content vs. Meta-Data) Categorization & Clustering Finding More Conclusion
Motivation More music is being stored digitally: PressPlay offers 300,000 tracks for download. As collections grow, organizing and searching manually become hard. How to find the “right” music in a sea of possibilities? How to find new artists given current preferences? How to find a song you heard on the radio?
Problem Overview Music is a high-dimensional time series: 5 minutes at CD quality is > 13M samples! It seems logical to apply data mining and IR techniques to this form of information: Query, Clustering, Prediction, etc. Application isn’t straightforward for reasons we’ll discuss shortly.
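The sample count on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the figure refers to five minutes of audio at the CD sampling rate of 44,100 samples per second per channel; the function name `sample_count` is introduced here for illustration only.

```python
SAMPLE_RATE_HZ = 44_100  # CD audio sampling rate, per channel


def sample_count(minutes: float, channels: int = 1) -> int:
    """Number of PCM samples in a clip of the given length."""
    return round(minutes * 60 * SAMPLE_RATE_HZ * channels)


if __name__ == "__main__":
    # One channel of a five-minute track already exceeds 13 million values.
    print(sample_count(5))
    print(sample_count(5, channels=2))
```

Treating each sample as one dimension is what makes a raw track such an unwieldy time series.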
Background: Types of Music Monophonic: one note sounds at a time. Homophonic: multiple notes sound – all starting (and ending) at the same instant. Polyphonic: no constraints on concurrency. Most general – and difficult to handle.
Background: Digital Representations Structured (Symbolic): MIDI – stores note duration & intensity, instructions for a synthesizer Unstructured (Sampled): PCM – stores quantized periodic samples Leverages the Nyquist/Shannon sampling theorem to faithfully capture the signal. MP3/Vorbis/AAC – discard “useless” information – reducing storage and fidelity Use psychoacoustics Some work aims at rediscovering musical structure.
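To make "quantized periodic samples" concrete, here is a minimal sketch of PCM encoding for a single sine tone. The function name `pcm_encode` and its parameters are assumptions made for illustration; real encoders also handle multiple channels, dithering, and file framing, none of which is shown here.

```python
import math


def pcm_encode(freq_hz, duration_s, sample_rate=44_100, bits=16):
    """Sample a sine tone at regular intervals and quantize to signed integers."""
    # The sampling theorem requires the tone to lie below half the sample rate.
    if freq_hz >= sample_rate / 2:
        raise ValueError("tone is at or above the Nyquist frequency")
    max_amp = 2 ** (bits - 1) - 1  # 32767 for 16-bit samples
    n = round(duration_s * sample_rate)
    return [
        round(max_amp * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]


if __name__ == "__main__":
    samples = pcm_encode(440, 0.01)  # 10 ms of concert A
    print(len(samples), samples[:5])
```

A MIDI file would instead store a single event for the same tone (note number, duration, intensity), which is why the symbolic form is so much easier to search.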
Background: Psychoacoustics Two main relevant results: Limited, freq. dependent resolution Auditory masking We hear different frequencies differently: sound spectrum broken into “critical bands” We “miss” signals due to spectral &/or temporal “collision.” Loud sounds mask softer ones, Two sounds of similar frequency get blended
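The idea of frequency-dependent resolution can be illustrated with the Bark scale, which maps frequency onto critical-band numbers. The slide does not name a formula; the sketch below assumes the commonly cited Zwicker and Terhardt approximation, and `hz_to_bark` is a name chosen here for illustration.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate critical-band rate (Bark) for a frequency in Hz.

    Assumption: the Zwicker and Terhardt closed-form approximation.
    """
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


if __name__ == "__main__":
    # The same 100 Hz step spans far more of a critical band at low frequencies.
    low_step = hz_to_bark(200) - hz_to_bark(100)
    high_step = hz_to_bark(10_100) - hz_to_bark(10_000)
    print(low_step, high_step)
```

Two tones that fall inside the same band are candidates for the blending and masking effects listed above, which is what perceptual codecs exploit.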
Query – Content is King Current systems use textual meta-data to facilitate query: Song/Album Title, Artist, Genre* The goal is to query by the musical content: Similarity ‘find songs “like” the current one’ ‘find songs “with” this musical phrase’
Result: Query By Humming A handful of research systems have been built that locate songs in a collection based on the user humming or singing a melodic portion of the song. Typically search over a collection of monophonic MIDI files.
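One way such a system might work is to reduce both the hummed query and each stored melody to a pitch contour (up, down, repeat) and then look for the closest approximate match. The sketch below assumes that approach; it is not taken from any particular research system, and the function names and the two-song collection are hypothetical. Melodies are given as MIDI note numbers.

```python
def contour(pitches):
    """Reduce a note sequence to U(p)/D(own)/R(epeat) steps between neighbors."""
    return "".join(
        "U" if b > a else "D" if b < a else "R"
        for a, b in zip(pitches, pitches[1:])
    )


def substring_distance(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    prev = [0] * (len(text) + 1)  # a match may start anywhere in text
    for i, pc in enumerate(pattern, 1):
        cur = [i]
        for j, tc in enumerate(text, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (pc != tc)))
        prev = cur
    return min(prev)  # a match may end anywhere in text


def best_match(hummed, collection):
    """Title of the melody whose contour best contains the hummed contour."""
    query = contour(hummed)
    return min(collection, key=lambda t: substring_distance(query, contour(collection[t])))


if __name__ == "__main__":
    collection = {  # hypothetical monophonic collection
        "twinkle": [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60],
        "ode_to_joy": [64, 64, 65, 67, 67, 65, 64, 62, 60, 60, 62, 64, 64, 62, 62],
    }
    hummed = [62, 62, 69, 69, 71, 71, 69]  # opening phrase, hummed in another key
    print(best_match(hummed, collection))
```

Working on contours rather than absolute pitches makes the match insensitive to the key the user hums in, and the edit distance tolerates a few wrong or missing notes.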
Content Based Query Recall: music is a time series with high dimensionality. Need robust dimensionality reduction. Not all parts of music are equally important. Feature extraction – remember the important features. Which features are important?
Similarity/Feature Extraction The current “hard problem” – there are ad hoc solutions, but little supporting theory. Tempo (bpm), volume, spectral qualities, transitions, etc. Sound source: is it a piano? a trumpet? Singer recognition: who’s the vocalist? Collectively: “Machine Listening” These are hard problems with some positive results.
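As an illustration of reducing a frame of samples to a few numbers, the sketch below computes two simple features: RMS energy as a stand-in for volume, and zero-crossing rate as a crude indicator of spectral content. These choices and the helper names are assumptions made here; the slide does not prescribe any particular feature set.

```python
import math


def rms(frame):
    """Root-mean-square amplitude: a rough measure of loudness."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def tone(freq_hz, n=1000, sample_rate=8000, amp=1.0):
    """Synthetic test signal: a sine tone of n samples."""
    return [amp * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    for name, frame in [("low", tone(200)), ("high", tone(1000)), ("quiet", tone(200, amp=0.5))]:
        # Each 1000-sample frame collapses to a 2-dimensional feature vector.
        print(name, round(rms(frame), 3), round(zero_crossing_rate(frame), 3))
```

Features like these are cheap to compute but say little about instrument or singer identity, which is part of why the problems listed above remain hard.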
Compression Complexity Different compression schemes (MP3/Vorbis/AAC) use psychoacoustics differently. Different implementations of a scheme may also! Feature extraction needs to be robust to these variations. Seems to be an open problem.
Categorization/Clustering Genre (rock/R&B/pop/jazz/blues/etc.) is manually assigned – and subjective. Work is being done on automatic classification and clustering. Relies on (and sometimes reinvents) the similarity metric work described previously.
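Once each song is reduced to a feature vector, clustering can group songs without any genre labels. The following is a minimal k-means sketch under the assumption that features have already been scaled to comparable ranges; the six feature vectors are made up for illustration, and k-means is only one of several algorithms that could be used here.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic start (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assign each song to its nearest centroid.
        assignment = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


if __name__ == "__main__":
    # Hypothetical (tempo, brightness) features, both scaled to [0, 1].
    songs = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9), (0.2, 0.1), (0.8, 0.85)]
    labels, centers = kmeans(songs, k=2)
    print(labels)
```

The quality of the resulting clusters depends entirely on the distance function, which is why this work leans so heavily on the similarity metrics.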
Browsing & Visualization: LOUD: physical exploration Islands of Music: uses self organizing maps to visualize clusters of similar songs.
Current Efforts Amazon/iTunes/etc. use collaborative filtering. If the population is myopic and predictable, it works well, otherwise not. Hit Song Science – clusters a provided set of songs against a database of top 30 hits to predict success. Claims to have predicted the success of Norah Jones. Relatable – musical “fingerprint” technology – involved with “Napster 2”
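Collaborative filtering recommends items based on what similar users liked, without looking at the audio at all. The sketch below is a generic user-based variant using cosine similarity; it is an assumption for illustration and does not describe how any of the named services actually work. The users, songs, and ratings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    dot = sum(u[s] * v[s] for s in set(u) & set(v))
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recommend(target, ratings):
    """Rank songs the target has not rated by similarity-weighted ratings."""
    mine = ratings[target]
    scores = {}
    for other, theirs in ratings.items():
        if other == target:
            continue
        sim = cosine(mine, theirs)
        for song, rating in theirs.items():
            if song not in mine:
                scores[song] = scores.get(song, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    ratings = {  # hypothetical listeners and ratings on a 1-5 scale
        "ann": {"song_a": 5, "song_b": 4},
        "bob": {"song_a": 5, "song_b": 5, "song_c": 4},
        "cat": {"song_a": 1, "song_d": 5},
    }
    print(recommend("ann", ratings))
```

Because the scores come only from other listeners' behavior, a song nobody has rated can never be recommended, which is one reason content-based methods are attractive.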
Finding More Conferences: Int. Symposium on Music IR (ISMIR) Int. Conference on Music and AI (ICMAI) Joint Conference on Digital Libraries Journals: ACM/IEEE Multimedia Groups: MIT Media Lab: Machine Listening Group
Conclusion Slow, steady progress is being made. “Music Appreciation” is fuzzy: we can’t define it, but we know it when we hear it. References, and more detail, are in my survey paper, available shortly on the web.