Algorithms for pattern discovery and pitch spelling in music. David Meredith, Goldsmiths College, University of London.



Overview of research interests. Music information retrieval: managing musical data and retrieving useful information from it. Automatic music transcription: computing a score from a recording of a musical passage. Computational music cognition and analysis: constructing computational models that extract the structures that listeners hear and that music analysts find interesting. Evaluation: sound methodologies and “gold-standard” test collections.

Musical pattern discovery and pitch spelling. Musical pattern discovery: finding themes and other perceptually important repeated patterns; useful for indexing in music information retrieval. Pitch spelling: predicting the pitch names (e.g., C#4) of notes in a “piano-roll” representation (e.g., MIDI); essential for transcribing music from MIDI (or audio) to notation.

Uses of musical pattern discovery algorithms. Indexing: store themes, motives and other memorable patterns in an index to enable sub-linear retrieval times. Transcription and music analysis: beat tracking and metrical structure analysis (similar patterns have similar metrical structure); grouping and phrasing (“parallelism” is the most important factor in grouping; Lerdahl and Jackendoff, 1983). Composer’s assistant and automatic improvisation: cure composer’s block by suggesting new material based on patterns discovered in music already written; automatically create new music that develops themes discovered in music already played.

Importance of repeated patterns in music analysis and cognition. Schenker (1954, p.5): repetition “is the basis of music as an art”. Bent and Drabkin (1987, p.5): “the central act” in all forms of music analysis is “the test for identity”. Lerdahl and Jackendoff (1983, p.52): “the importance of parallelism [i.e., repetition] in musical structure cannot be overestimated. The more parallelism one can detect, the more internally coherent an analysis becomes, and the less independent information must be processed and retained in hearing or remembering a piece”.

Most musical repetitions are neither perceived nor intended

Interesting musical repetitions are structurally diverse. We want to discover all and only the interesting repeated patterns. The class of interesting repeated patterns is structurally diverse because patterns vary widely in their structural characteristics and because there are many ways of transforming a musical pattern to give another pattern that is perceived to be a version of it (e.g., truncated, augmented, diminished, inverted, embellished and even reversed).

Example of repeated motive

Example of thematic transformation

String-based algorithms for discovering musical patterns. Most previous approaches assume that music is represented as strings: each string represents a voice or part, and each character represents a note or an interval between two consecutive notes in a voice. Similarity between two patterns is measured in terms of edit distance, calculated using dynamic programming; see, e.g., Lemström (2000), Hsu et al. (1998), Rolland (1999).

Problems with the string-based approach: edit distance. Pattern B is an embellished version of pattern A. If both patterns are represented as strings in which each symbol represents the pitch of a note, then the edit distance between A and B is 9. If a pattern with 9 differences is allowed to count as a match, then the search returns many spurious hits.
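To make the edit-distance argument concrete, here is a minimal sketch of the standard dynamic-programming (Levenshtein) distance applied to pitch sequences. The two pitch lists are hypothetical MIDI note numbers chosen for illustration; they are not patterns A and B from the slide.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, by dynamic programming."""
    # prev[j] is the distance between the items of a seen so far
    # (excluding the current one) and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,           # delete x
                            curr[j - 1] + 1,       # insert y
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical example: a three-note motive and a version with passing notes.
motive = [60, 64, 67]
embellished = [60, 62, 64, 65, 67]
print(edit_distance(motive, embellished))  # 2: one insertion per added note
```

Every added ornament costs one edit, so a heavily embellished variant ends up far from its model even though a listener hears it as the same pattern.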

Problems with the string-based approach: polyphony. If we are searching polyphonic music and do not know the voice to which each note belongs (e.g., a MIDI format 0 file), or if we are interested in patterns containing notes from 2 or more voices, then there is a combinatorial explosion in the number of possible string representations. If we do not use all possible representations, then we may not find all interesting patterns.

Using multidimensional point sets to represent music (1)

Using multidimensional point sets to represent music (2)

SIA: discovering all maximal translatable patterns (MTPs). A pattern is translatable by a vector v in a dataset if it can be translated by v to give another pattern in the dataset. The MTP for a vector v contains all points mapped by v onto other points in the dataset. O(kn^2 log n) time, O(kn^2) space; O(kn^2) average time with hashing (Lemström).
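The MTP definition above can be sketched in a few lines. This is a minimal illustration, assuming points are tuples of integers (here hypothetical onset and pitch pairs) and using a hash table to group points by difference vector, in the spirit of the hashing variant mentioned on the slide; it is not the C implementation referred to in the talk.

```python
from collections import defaultdict


def sia(dataset):
    """Map each non-zero difference vector v to its maximal translatable
    pattern: every point p in the dataset such that p + v is also in it."""
    points = sorted(set(dataset))
    mtps = defaultdict(list)
    for i, p in enumerate(points):
        # Only later points in lexicographic order are considered, so each
        # vector is stored once (v but not -v).
        for q in points[i + 1:]:
            v = tuple(b - a for a, b in zip(p, q))
            mtps[v].append(p)
    return dict(mtps)


# Hypothetical dataset: a three-note figure stated twice, four time units apart.
dataset = [(0, 60), (1, 62), (2, 64), (4, 60), (5, 62), (6, 64)]
for vector, pattern in sia(dataset).items():
    print(vector, pattern)
```

For n points the double loop visits n(n-1)/2 pairs, which is where the n^2 factor in the complexity comes from.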

SIATEC - Discovering all occurrences of all MTPs

Absolute running times of SIA and SIATEC. SIA and SIATEC were implemented in C and run on a 500MHz Sparc on 52 datasets (6 ≤ n ≤ 3456, 2 ≤ k ≤ 5). SIA took less than 2 minutes to process a piece with 3500 notes; SIATEC took 13 minutes to process a piece with 2000 notes.

Need for heuristics to isolate interesting MTPs. There are 2^n patterns in a dataset of size n, and SIA generates fewer than n^2/2 patterns, so SIA generates a small fraction of all the patterns in a dataset. Many interesting patterns are derivable from the patterns found by SIA, BUT many of the patterns found by SIA are NOT interesting: SIA finds 70,000 patterns in Rachmaninoff’s Prelude in C# minor, of which probably about 100 are interesting. We therefore need heuristics for isolating the interesting patterns in the output of SIA and SIATEC.

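Heuristics for isolating musical themes and motives. [Figure: example patterns annotated with coverage (Cov = 6, Cov = 9), compression ratio (CR = 6/5, CR = 9/5) and compactness (Comp = 1/3, Comp = 2/5, Comp = 2/3).]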

COSIATEC: data compression using SIATEC. [Flowchart: start with the dataset; run SIATEC to obtain a list of pairs; print out the best pattern, P, and its translators; remove the occurrences of P from the dataset; if the dataset is empty, end; otherwise run SIATEC again on what remains.]
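The loop in the flowchart can be sketched as a greedy compressor. This is a rough illustration under several assumptions that the slide does not state: candidate patterns are the MTPs of the remaining points, the "best" pattern is the one with the highest compression ratio, and that ratio is taken to be the number of covered points divided by (pattern size + number of translators - 1), with the zero vector counted among the translators. All function names are hypothetical.

```python
from collections import defaultdict


def add(p, v):
    return tuple(a + b for a, b in zip(p, v))


def sub(q, p):
    return tuple(a - b for a, b in zip(q, p))


def mtps(points):
    """Maximal translatable patterns of a point set (the SIA step)."""
    pts = sorted(points)
    table = defaultdict(list)
    for i, p in enumerate(pts):
        for q in pts[i + 1:]:
            table[sub(q, p)].append(p)
    return list(table.values())


def translators(pattern, points):
    """All vectors, including the zero vector, that map pattern into points."""
    return [v for v in (sub(q, pattern[0]) for q in sorted(points))
            if all(add(p, v) in points for p in pattern)]


def cosiatec(dataset):
    """Greedily encode the dataset as a list of (pattern, translators) pairs."""
    remaining = set(dataset)
    encoding = []
    while remaining:
        # A lone point has no MTPs, so fall back to singleton patterns.
        candidates = mtps(remaining) or [[p] for p in remaining]
        best = None
        for pattern in candidates:
            vectors = translators(pattern, remaining)
            covered = {add(p, v) for p in pattern for v in vectors}
            # Assumed quality measure: compression ratio (see lead-in).
            ratio = len(covered) / (len(pattern) + len(vectors) - 1)
            if best is None or ratio > best[0]:
                best = (ratio, pattern, vectors, covered)
        _, pattern, vectors, covered = best
        encoding.append((pattern, vectors))
        remaining -= covered  # remove all occurrences of the chosen pattern
    return encoding


def decode(encoding):
    """Rebuild the point set from its (pattern, translators) encoding."""
    return {add(p, v) for pattern, vectors in encoding
            for p in pattern for v in vectors}


# Hypothetical dataset: a repeated three-note figure plus one stray note.
dataset = [(0, 60), (1, 62), (2, 64), (4, 60), (5, 62), (6, 64), (9, 70)]
encoding = cosiatec(dataset)
assert decode(encoding) == set(dataset)
for pattern, vectors in encoding:
    print(pattern, vectors)
```

The encoding is lossless, and it is shorter than the input whenever the music contains repetition, which is why the same loop doubles as a way of surfacing themes.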

Using COSIATEC for finding themes and motives in music. [Figure panels: first iteration; second iteration.]

SIAM: pattern matching using SIA. O(knm log(nm)) time, O(knm) space; O(knm) average time with hashing. [Figure labels: query pattern; dataset.]
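One way to read the SIAM idea is as vote counting: every pair of a query point and a dataset point votes for the translation that maps one onto the other, and translations with many votes are good (possibly partial) matches. The sketch below uses a hash-based counter; the function name and the example data are assumptions for illustration only.

```python
from collections import Counter


def siam(query, dataset):
    """For each translation vector, count how many query points it maps
    onto dataset points. Returns (vector, count) pairs, best first."""
    counts = Counter(
        tuple(d - q for q, d in zip(qp, dp))
        for qp in sorted(set(query))
        for dp in sorted(set(dataset))
    )
    return counts.most_common()


# Hypothetical query (a rising three-note figure) and dataset.
query = [(0, 0), (1, 2), (2, 4)]
dataset = [(0, 60), (1, 62), (2, 64), (4, 60), (5, 62), (6, 64)]
for vector, matched in siam(query, dataset)[:3]:
    print(vector, matched)
```

A count equal to the query size is an exact occurrence; smaller counts are partial matches, which no string representation is needed to find.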

Improving SIAM: Ukkonen, Lemström & Mäkinen (2003). Uses sweepline-like scanning of the dataset (Bentley and Ottmann, 1979). Generalized to approximate matching of sets of horizontal line segments. Improved the running time to O(mn log m) (without hashing) and the working space to O(m). Implemented as algorithm P2 on the C-BRAHMS demo web site.

Improving SIAM: Clifford (in preparation). Finds the best match in O(n log n) time. Steps: reduce the problem to one dimension by randomised projection; reduce the length of the problem by uniform hashing; perform pattern matching using FFTs; find the best match and check, in O(m) time, exactly how many points match at the location that can be inferred from this match.

Pitch spelling algorithms (1)

Pitch spelling algorithms (2)

Pitch spelling in tonal music. The pitch name of a note depends on harmonic structure and voice-leading structure. The pitch name is chosen so that the score correctly represents the way the music is intended to be perceived and interpreted (Piston, 1978, p.8).

Comparative analysis of pitch spelling algorithms. Algorithms analysed, evaluated and (in some cases) improved: Longuet-Higgins (1976, 1987, 1993); Cambouropoulos (1996, 1998, 2001, 2003); Temperley (2001); Chew and Chen (2003, 2005); Meredith (2003, 2005, 2006). Test corpus: 216 movements by 8 baroque and classical composers, with an almost exactly equal number of notes (24500) for each composer.

Evaluation criteria and performance metrics. Evaluation criteria: spelling accuracy (how well an algorithm predicts the pitch names) and style dependence (how much spelling accuracy depends on style). Performance metrics: note accuracy (the proportion of notes in the corpus spelt correctly) and style dependence (the standard deviation of the note accuracies over the 8 composers). Robustness to temporal deviations: the best versions of the algorithms were also run on a version of the test corpus in which onsets and durations were randomly adjusted.
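The two metrics are simple enough to state as code. In this minimal sketch, note accuracy is reported as a percentage, and style dependence is computed as a population standard deviation; the slide does not say whether the population or the sample form was used, so that choice is an assumption. The composer labels and pitch names are made up.

```python
from statistics import pstdev


def note_accuracy(predicted, correct):
    """Percentage of notes whose predicted pitch name equals the correct one."""
    hits = sum(p == c for p, c in zip(predicted, correct))
    return 100.0 * hits / len(correct)


def style_dependence(accuracy_by_composer):
    """Standard deviation of the per-composer note accuracies.
    Assumption: population standard deviation."""
    return pstdev(list(accuracy_by_composer.values()))


# Hypothetical data, for illustration only.
predicted = ["C#4", "D4", "Eb4", "F4"]
correct = ["C#4", "D4", "D#4", "F4"]
print(note_accuracy(predicted, correct))

accuracies = {"composer A": 99.0, "composer B": 97.0}
print(style_dependence(accuracies))
```

A low standard deviation means the algorithm spells the music of every composer about equally well, which is what "low style dependence" is meant to capture.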

Longuet-Higgins’s algorithm (1976, 1987, 1993). Uses 6 rules to predict pitch names. Rule 1: pitch names should be as close as possible to the tonic on the line of fifths. Rules 2-6 deal with chromatic intervals and key changes. Rule 2 is incorrectly implemented in music.p. Six versions of the algorithm were tested: the original and two versions with Rule 2 “corrected”, plus the same three algorithms with pitch names not restricted to lying between G double sharp and A double flat. Two versions of the test corpus were used: voices arranged “end-to-end” (which should be better) and voices “interleaved”, with notes sorted by onset and pitch.
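Rule 1 on its own can be illustrated with a few lines of arithmetic on the line of fifths. The sketch below assumes the convention C = 0, G = 1, F = -1, so that position k has pitch class 7k mod 12. It covers only the "closest to the tonic" idea, not Rules 2-6, and it resolves the tritone tie towards the sharp side, which is an arbitrary choice made for this example.

```python
def name_at(position):
    """Pitch-class name at a line-of-fifths position (C = 0, G = 1, F = -1)."""
    letter = "FCGDAEB"[(position + 1) % 7]
    accidentals = (position + 1) // 7  # > 0 means sharps, < 0 means flats
    return letter + ("#" * accidentals if accidentals > 0 else "b" * -accidentals)


def spell_near_tonic(pitch_class, tonic_position):
    """Spell a pitch class (0-11) with the name closest to the tonic.
    The 12 positions tonic-5 .. tonic+6 contain each pitch class once."""
    for position in range(tonic_position - 5, tonic_position + 7):
        if (7 * position) % 12 == pitch_class:
            return name_at(position)


# Pitch class 1 is spelt differently depending on the tonic.
print(spell_near_tonic(1, 0))  # tonic C (position 0)
print(spell_near_tonic(1, 4))  # tonic E (position 4)
```

The same MIDI pitch class therefore gets a flat name near a flat-side tonic and a sharp name near a sharp-side tonic, which is the behaviour the rule is after.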

Longuet-Higgins’s algorithm: results. Correcting the Rule 2 implementation lowered note accuracy. The algorithm made half as many errors when the voices were arranged end-to-end. Allowing pitch names to be anywhere on the line of fifths doubled the number of errors. The original version performed best (NA = 98.21%; SD = 1.79).

Cambouropoulos’s algorithm (1996, 1998, 2001, 2003). Three published versions of the algorithm. Input changed to a sequence of MIDI note numbers. A shifting, overlapping window improves running time and avoids boundary errors. Computes all spellings for each window (128 spellings for each 9-note window). A spelling is penalised if it contains intervals that are rare in tonal scales or if it contains double sharps or double flats.

Cambouropoulos’s algorithm: evaluation. There are 18 ways in which two versions of the algorithm could differ (e.g., variable-length or fixed-length window). 26 versions were implemented and tested, with the goal of estimating the optimal combination of variable features. Window: variable-length is better than fixed-length; the best variable-length window version achieved NA = 99.07%; SD = 0.46. Increasing the window size increases accuracy but exponentially increases running time; a 12-note window is the practical maximum. The algorithm with the ‘optimal’ combination of features achieved NA = 99.15%; SD = 0.47.

Temperley and Sleator’s pitch spelling algorithm (2001)

Temperley and Sleator’s algorithm: evaluation. The output of the meter program depends on tempo, so the system was tested on 6 versions of the corpus, each with a different tempo. It performed best on the natural-tempo and half-speed corpora: NA = 99.30%; SD = 1.13 (without enharmonic change); NA = 97.79%; SD = 4.57 (with enharmonic change). It is highly sensitive to tempo: at 4 times the natural tempo, NA = 74.58%, which is worse than just spelling all black notes randomly as either sharp or flat! A simple implementation of TPR 1 alone achieved NA = 99.04%; SD = 0.65.

Chew and Chen’s algorithm (2003, 2005). Based on the “spiral array”, which is the line of fifths coiled up. The tonic is represented by the center of effect (CE), the centroid of the positions in the spiral array of the pitch names in the preceding window. Notes are first spelt so as to be close to the global CE, then re-spelt so as to be close to a weighted average of the local and cumulative CEs.
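The centroid idea can be illustrated in one dimension. The sketch below is a deliberately simplified single-pass heuristic on the line of fifths, not Chew and Chen's algorithm: it has no spiral array, no global, local and cumulative CEs, and no re-spelling pass. Each note is spelt with the candidate name nearest to the mean position of the previously spelt notes in a sliding window. The window size and the starting centre are hypothetical parameters.

```python
def name_at(position):
    """Pitch-class name at a line-of-fifths position (C = 0, G = 1, F = -1)."""
    letter = "FCGDAEB"[(position + 1) % 7]
    accidentals = (position + 1) // 7
    return letter + ("#" * accidentals if accidentals > 0 else "b" * -accidentals)


def spell_by_centroid(pitch_classes, window=8, start=0.0):
    """Spell each pitch class with the name closest to the centroid of the
    line-of-fifths positions of the preceding `window` spelt notes."""
    positions = []
    for pc in pitch_classes:
        recent = positions[-window:]
        centre = sum(recent) / len(recent) if recent else start
        # Position k has pitch class 7k mod 12, so the candidates for pc
        # are base + 12m for integer m, where base = 7 * pc mod 12.
        base = (7 * pc) % 12
        m = round((centre - base) / 12)
        positions.append(base + 12 * m)
    return [name_at(k) for k in positions]


# Hypothetical inputs: the same kind of figure in a sharp and a flat context.
print(spell_by_centroid([2, 6, 9, 1]))
print(spell_by_centroid([5, 10, 3, 8]))
```

Working on the line of fifths rather than the spiral array is in keeping with the evaluation finding reported in this talk that the line of fifths worked just as well.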

Chew and Chen’s algorithm: evaluation. The new implementation allows the user to use the line of fifths instead of the spiral array, to consider notes starting in each window instead of notes sounding in each window when computing CEs, and to change the aspect ratio of the spiral array. It was run 1296 times on the test corpus, each time with a different combination of parameter values. The best 12 versions scored NA = 99.15%, SD = 0.4. It worked best when all three CEs were used, the local and cumulative CEs were weighted equally, and the chunks were small. The line of fifths worked just as well as the spiral array.

PS13s1 (Meredith, 2003, 2005, 2006). The pitch name implied by a tonic is the one that is closest to the tonic on the line of fifths. The strength with which a tonic is implied is proportional to its frequency of occurrence. The strength with which a pitch name is implied is proportional to the sum of the frequencies of occurrence of the tonics implying that pitch name.
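The voting scheme described above can be sketched as follows. This is a toy illustration of the three statements on the slide, not the PS13s1 algorithm itself. It assumes that candidate tonics are the pitch classes occurring in the context, that each candidate tonic takes a default spelling within five flats and six sharps of C on the line of fifths, and that the context runs from k_pre notes before to k_post notes after the note being spelt, inclusive. All names are hypothetical.

```python
from collections import Counter


def name_at(position):
    """Pitch-class name at a line-of-fifths position (C = 0, G = 1, F = -1)."""
    letter = "FCGDAEB"[(position + 1) % 7]
    accidentals = (position + 1) // 7
    return letter + ("#" * accidentals if accidentals > 0 else "b" * -accidentals)


def fifths_offset(interval):
    """Line-of-fifths offset in -5..6 for an interval in semitones (mod 12)."""
    return next(r for r in range(-5, 7) if (7 * r) % 12 == interval % 12)


def spell_note(chromas, index, k_pre, k_post):
    """Spell chromas[index] by letting every pitch class in the context act
    as a tonic and vote, weighted by how often it occurs."""
    context = chromas[max(0, index - k_pre): index + k_post + 1]
    votes = Counter()
    for tonic, frequency in Counter(context).items():
        tonic_position = fifths_offset(tonic)  # assumed default tonic spelling
        # Name implied by this tonic: the one closest to it on the line of fifths.
        implied = tonic_position + fifths_offset(chromas[index] - tonic)
        votes[implied] += frequency
    best = max(votes, key=lambda pos: (votes[pos], -abs(pos)))
    return name_at(best)


# Hypothetical contexts: pitch class 3 in a sharp and in a flat environment.
print(spell_note([4, 8, 11, 4, 3, 4], index=4, k_pre=10, k_post=10))
print(spell_note([5, 9, 0, 5, 3, 5], index=4, k_pre=10, k_post=10))
```

Because the votes come only from note frequencies in the context, the method needs no explicit key finding or metrical analysis.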

PS13s1: results. Takes two parameters: the precontext (K_pre), the number of notes preceding the note to be spelt that are included in the context, and the postcontext (K_post), the number of notes following the note to be spelt that are included in the context. PS13 was run with all values of K_pre and K_post between 1 and 50. PS13s1 was run with the 17 best values obtained with PS13, and made 15-19% fewer errors than PS13 for these parameter values. Some results: NA = 99.44%, SD = 0.49 (K_pre = 10, K_post = 42); NA = 99.44%, SD = 0.45 (K_pre = 33, K_post = 25); NA = 99.19%, SD = 0.51 (K_pre = 40, K_post = 1).

Summary of pitch spelling results. [Table, values not preserved: note accuracy (NA%) and standard deviation (SD) on the clean corpus and on the noisy corpus for PS13s1 (x), Temperley (*), Chew and Chen, Cambouropoulos, and Longuet-Higgins (§).] Table notes: x: K_pre = 33, K_post = 25. *: two-pass, natural tempo corpus, without enharmonic change. +: new optimized versions. §: only when music processed a voice at a time.

Future work. Pattern discovery and pattern matching: compare the SIA algorithms with methods developed in other, more mature fields (e.g., computer vision, graph matching); improve the time complexity of the SIA algorithms with advanced algorithmic techniques (e.g., randomized projection, hashing); adapt the algorithms for approximate matching and scaling (matching at different tempi); adapt SIA and SIATEC for early pruning of uninteresting patterns. Pitch spelling: incorporate PS13s1 into a complete MIDI-to-notation transcription system; use PS13s1 for key-tracking and harmonic analysis; use PS13s1 for feature extraction on audio data.

Acknowledgements and further details. Thanks to Chris Bishop and Stephen Robertson for inviting me to give a talk; Geraint Wiggins for suggesting SIAM; Kjell Lemström for developing SIAM further; Raphaël Clifford for developing SIAM further still; and EPSRC for funding (GR/S17253/02, GR/N08049/01). Further details: