Catherine Lai MUMT-611 MIR February 17, 2005

Automated Transcription of Polyphonic Piano Music: A Brief Literature Review
Catherine Lai, MUMT-611 MIR, February 17, 2005

Outline of Presentation
- Introduction
  - transcription of polyphonic music
  - systems targeted on specific instruments
- Current state of the art: various approaches
- Recent published piano transcription systems
  - Dixon, 2000
  - Raphael, 2002
  - Monti and Sandler, 2002
  - Marolt, 2004
- Discussion and conclusion
- Links to examples of transcription of piano music recordings
- Bibliography

Introduction: Transcription of Polyphonic Music
- Transcription: acoustical waveform --> parametric representation
  - extract pitches, starting times, and durations
- First attempt by Moorer, 1975
  - limited note range
  - restricted to two voices
- Martin, 1996
  - piano transcription system
  - up to four voices
  - chorale style of J.S. Bach (long durations with block chords)
- Later systems
  - tackled these limitations
  - targeted specific instruments
- Focus of this literature review: automated transcription of polyphonic piano music

Current State-of-the-Art: Various Approaches
- Automated transcription of polyphonic piano music
  - input: audio files containing polyphonic piano music
  - output: MIDI representing pitch, timing, and volume
- Simon Dixon, 2000, “On the Computer Recognition of Solo Piano Music”
  - standard signal processing (SP), adaptive peak-picking, pattern matching
- Christopher Raphael, 2002, “Automatic Transcription of Piano Music”
  - hidden Markov model (HMM)
- Monti and Sandler, 2002, “Automatic Polyphonic Piano Note Extraction Using Fuzzy Logic in a Blackboard System”
  - blackboard algorithm
- Matija Marolt, 2004, “A Connectionist Approach to Automatic Transcription of Polyphonic Piano Music”
  - neural network models
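As a rough illustration of the target representation, the sketch below models one transcribed note with the pitch, timing, and volume fields that the MIDI output carries. The class name `NoteEvent`, its field names, and the helper `frequency_to_midi` are hypothetical names chosen for this example and come from none of the reviewed systems. The frequency-to-pitch mapping is the standard equal-tempered convention (A4 = 440 Hz = MIDI note 69).

```python
import math
from dataclasses import dataclass


@dataclass
class NoteEvent:
    """One transcribed note (hypothetical structure for illustration)."""
    pitch: int       # MIDI note number, 60 = middle C
    onset: float     # starting time in seconds
    duration: float  # length in seconds
    velocity: int    # MIDI velocity (volume), 0-127


def frequency_to_midi(freq_hz: float) -> int:
    """Map a fundamental frequency to the nearest equal-tempered MIDI note."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(freq_hz / 440.0))


# A middle C detected at 0.5 s, lasting 0.25 s, played moderately loud.
note = NoteEvent(pitch=frequency_to_midi(261.63), onset=0.5, duration=0.25, velocity=64)
```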

Published Piano Transcription System: Simon Dixon, 2000
Simon Dixon, “On the Computer Recognition of Solo Piano Music” [standard SP approach]
- First processing stage
  - low-pass filtering --> down-sampling the signal (12 kHz)
- Time-frequency representation
  - STFT --> power spectrum --> spectral peak extraction (local maxima above a threshold, adaptive peak-picking algorithm)
  - frequency tracks --> grouping of partials --> musical notes
- Evaluation
  - 13 Mozart piano sonatas performed by a concert pianist
  - Bösendorfer SE290 computer-monitored piano --> MIDI
- Results
  - N = number of correctly identified notes
  - FP = number of notes reported but not played
  - FN = number of notes played but not reported by the system
  - an incorrectly identified note counts as both an FP and an FN
  - score = N / (FP + FN + N)
  - recognition accuracy of 70-80%
- Future development: accuracy of dynamics and offset times
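The peak-extraction step and the scoring formula can be sketched in a few lines. This is a minimal sketch and not Dixon's implementation: `pick_peaks` uses a fixed threshold where the paper uses an adaptive one, and the function names and example values are assumptions made for illustration.

```python
def pick_peaks(power_spectrum, threshold):
    """Return indices of local maxima that exceed a fixed threshold.

    Simplification: the reviewed system adapts its threshold; here it is constant.
    """
    peaks = []
    for i in range(1, len(power_spectrum) - 1):
        value = power_spectrum[i]
        if (value > threshold
                and value > power_spectrum[i - 1]
                and value > power_spectrum[i + 1]):
            peaks.append(i)
    return peaks


def dixon_score(n_correct, false_pos, false_neg):
    """score = N / (FP + FN + N), as defined on the slide."""
    total = n_correct + false_pos + false_neg
    if total == 0:
        raise ValueError("no notes to score")
    return n_correct / total


# Hypothetical values, for illustration only.
example_peaks = pick_peaks([0.0, 1.0, 5.0, 2.0, 0.5, 3.0, 1.0], threshold=2.5)
example_score = dixon_score(n_correct=75, false_pos=15, false_neg=10)
```

Because both false positives and false negatives sit in the denominator, the score drops for either kind of error.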

Published Piano Transcription System: Christopher Raphael, 2002
Christopher Raphael, “Automatic Transcription of Piano Music” [HMM]
- HMM with a trained likelihood model
  - statistical pattern recognition and machine learning for musical structures
- Process
  - segment the signal into frames
  - extract a feature vector from each frame
  - assign each frame a label describing its content
- Feature vector
  - total energy (playing or silent)
  - local “burstiness” (attack versus steady behavior)
  - pitch configuration
- Labels
  - the collection of sounding pitches and their articulation state (attack, sustain, rest)
- Model setup
  - hidden process: the label sequence; observable process: the feature vectors
  - generate reasonable hypotheses for each frame and construct a search graph of the hypotheses
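The hidden-label and observable-feature setup can be illustrated with a toy Viterbi decoder. This is a sketch under strong assumptions and not Raphael's model: it uses only three labels (rest, attack, sustain), one discretised observation per frame, and transition and emission probabilities invented for the example.

```python
import math

STATES = ("rest", "attack", "sustain")

# All probabilities below are hypothetical, chosen only for illustration.
START = {"rest": 0.9, "attack": 0.1, "sustain": 0.0}
TRANS = {
    "rest":    {"rest": 0.8, "attack": 0.2, "sustain": 0.0},
    "attack":  {"rest": 0.0, "attack": 0.1, "sustain": 0.9},
    "sustain": {"rest": 0.2, "attack": 0.1, "sustain": 0.7},
}
EMIT = {
    "rest":    {"silent": 0.90, "burst": 0.05, "steady": 0.05},
    "attack":  {"silent": 0.05, "burst": 0.80, "steady": 0.15},
    "sustain": {"silent": 0.10, "burst": 0.10, "steady": 0.80},
}


def _log(p):
    """Log probability, with log(0) mapped to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations):
    """Return the most likely label sequence for a list of frame observations."""
    if not observations:
        return []
    scores = {s: _log(START[s]) + _log(EMIT[s][observations[0]]) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor for label s at this frame.
            prev = max(STATES, key=lambda p: scores[p] + _log(TRANS[p][s]))
            new_scores[s] = scores[prev] + _log(TRANS[prev][s]) + _log(EMIT[s][obs])
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final label.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


labels = viterbi(["silent", "burst", "steady", "steady", "silent"])
```

In the reviewed system the label also encodes which pitches are sounding, so the state space is far larger than this three-label toy; that is why a search graph of per-frame hypotheses is built instead of enumerating every state.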

Published Piano Transcription System: Christopher Raphael, 2002 (continued)
Christopher Raphael, “Automatic Transcription of Piano Music” [HMM]
- Experiment
  - Mozart piano sonata
  - range limited from the C two octaves below middle C to the F two and a half octaves above middle C
  - four voices or fewer
- Evaluation
  - borrowed from the “Word Error Rate” used to evaluate speech recognition
  - Error Rate = 100 * (Insertions + Deletions + Substitutions) / (Total Words in Truth Sentence)
  - preliminary results show a “Note Error Rate” of 39%
  - 184 substitutions, 241 deletions, and 108 insertions out of 1360 notes
- Future improvement
  - simple additions may yield better results:
    - likelihood of chord sequences
    - more informative acoustic cues for note onsets
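The reported error rate can be checked directly from the counts on the slide. The function name `note_error_rate` is an assumption for this example; the formula and the numbers are the ones quoted above.

```python
def note_error_rate(insertions, deletions, substitutions, total_notes):
    """Error Rate = 100 * (Insertions + Deletions + Substitutions) / Total."""
    if total_notes <= 0:
        raise ValueError("total_notes must be positive")
    return 100.0 * (insertions + deletions + substitutions) / total_notes


rate = note_error_rate(insertions=108, deletions=241, substitutions=184, total_notes=1360)
print(f"Note Error Rate: {rate:.1f}%")  # prints "Note Error Rate: 39.2%"
```

The 533 errors out of 1360 notes give about 39.2%, which matches the 39% reported. Because insertions are counted against the number of true notes, this measure can in principle exceed 100%, unlike an accuracy score.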

Published Piano Transcription System: Monti and Sandler, 2002
Monti and Sandler, “Automatic Polyphonic Piano Note Extraction Using Fuzzy Logic in a Blackboard System” [Blackboard algorithm]
- Implementation
  - polyphonic note recognition using a Fuzzy Inference System (FIS) as one of the Knowledge Sources (KSs) in a blackboard system
- Blackboard model
  - arranged as a hierarchy of data abstraction levels
  - KSs drive progress through the hierarchy and are activated by the Scheduler
- FIS
  - takes spectral peaks that have not yet been selected
  - creates new Note Candidates
  - evaluates each Candidate by its features:
    - fundamental of the note
    - harmonic rate
    - difference between the maximum peak in the spectrum and the Candidate’s fundamental energy
[Figure: Blackboard system (Monti and Sandler, 2002)]

Published Piano Transcription System: Monti and Sandler, 2002 (continued)
Monti and Sandler, “Automatic Polyphonic Piano Note Extraction Using Fuzzy Logic in a Blackboard System” [Blackboard algorithm]
- Evaluation
  - 14 piano pieces by various composers, including Beethoven, Mozart, Debussy, Ravel, and Scarlatti
- Results
  - N = correctly identified notes; FP = notes reported but not played; FN = notes played but not reported by the system
  - Dixon’s score = N / (FP + FN + N)
  - detection success rate under Dixon’s score = 45% correct
  - 75% = correctly detected notes / total transcribed notes
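The two figures above measure different things, which a small sketch makes concrete. The counts below are hypothetical, chosen only so that the arithmetic reproduces 45% and 75%; they are not taken from the paper. The function names are also assumptions for this example.

```python
def dixon_score(n_correct, false_pos, false_neg):
    """N / (FP + FN + N): penalises both spurious and missed notes."""
    return n_correct / (n_correct + false_pos + false_neg)


def transcribed_note_accuracy(n_correct, false_pos):
    """Correctly detected notes / total transcribed notes: ignores missed notes."""
    return n_correct / (n_correct + false_pos)


# Hypothetical counts (not from the paper): 45 correct, 15 spurious, 40 missed.
N, FP, FN = 45, 15, 40
score = dixon_score(N, FP, FN)                 # 45 / 100
accuracy = transcribed_note_accuracy(N, FP)    # 45 / 60
print(f"Dixon score: {score:.0%}, transcribed-note accuracy: {accuracy:.0%}")
```

The second ratio leaves missed notes out of the denominator, so it is always at least as high as Dixon's score on the same data. This is one reason results reported under different formulas are hard to compare.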

Published Piano Transcription System: Matija Marolt, 2004
Matija Marolt, “A Connectionist Approach to Automatic Transcription of Polyphonic Piano Music” [Neural networks approach]
- A new model based on networks of adaptive oscillators was proposed and implemented in SONIC for partial tracking and note recognition
  1. acoustical waveform --> time-frequency space, using an auditory model
  2. the auditory model outputs a set of frequency channels
  3. periodicity in the frequency channels is related to pitch perception
  4. adaptive oscillators are used to calculate periodicity in the frequency channels
  5. each adaptive oscillator tries to synchronize to the signal in an output frequency channel of the auditory model by adjusting its phase and frequency
  6. synchronization indicates that the channel output is periodic and that a partial with a frequency similar to that of the filter is present
- 76 neural networks; other models tested include the multilayer perceptron, radial basis function networks, etc.
[Figure: Marolt, 2004]

Published Piano Transcription System: Matija Marolt, 2004 (continued)
Matija Marolt, “A Connectionist Approach to Automatic Transcription of Polyphonic Piano Music” [Neural networks approach]
- Evaluation
  - tested on synthesized and real recordings of various genres
- Results
  - synthesized recordings: around 90% of all notes
  - real recordings: results not as good (figures not available)
  - most common errors (> 50%): octaves and rapidly played notes (e.g. arpeggios, trills)
  - greatest challenge: very expressive playing
    - Chopin’s Nocturnes
    - quiet and almost inaudible left hand
- Further development
  - detecting repeated notes
[Figure: Marolt, 2004]

Discussion and Conclusion
- Various approaches proposed
  - standard SP techniques; HMM; blackboard algorithm; neural networks
- Common mistakes
  - octaves, rapid passages, and quiet notes
- Difficulties
  - no standard set of test examples and no standard evaluation function
  - varying constraints and formulas --> comparison is difficult

Piano transcription system    Performance results
Dixon                         70-80% correct
SONIC                         80-95% correct
Raphael                       39% wrong
Monti and Sandler             74% correct

Links to examples of transcription of piano music recordings
(Marolt) (Dixon)

Bibliography
- Dixon, S. 2000. On the Computer Recognition of Solo Piano Music. Australasian Computer Music Conference.
- Marolt, M. 2004. A connectionist approach to automatic transcription of polyphonic piano music. IEEE Transactions on Multimedia 6, no. 3 (June).
- Martin, K. 1996. A blackboard system for automatic transcription of simple polyphonic music. MIT Media Laboratory Perceptual Computing Section Technical Report No. 385.
- Monti, G., and M. Sandler. 2002. Automatic Polyphonic Piano Note Extraction Using Fuzzy Logic in a Blackboard System. Proceedings of the International Conference on Digital Audio Effects.
- Moorer, J. 1975. On the segmentation and analysis of continuous musical sound by digital computer. Ph.D. thesis, Stanford University, CCRMA.
- Raphael, C. 2002. Automatic Transcription of Piano Music. Proceedings of the International Conference on Music Information Retrieval.
