A Music Search Engine for Plagiarism Detection


1 A Music Search Engine for Plagiarism Detection
Yicong Tao Jinglin Wang

2 Motivation
Detect music plagiarism in a large music dataset: the Lakh MIDI dataset (associated with the Million Song Dataset).
Help people capture a great melody that suddenly crosses their mind.
Search for songs with a similar melody.

3 Observation about Music Plagiarism
Similar patterns appear in small sections of the whole song.
There are three types of plagiarism, depending on how the similar patterns are reproduced:
(1) with different instruments: Sampling Plagiarism
(2) with different speed: Rhythm Plagiarism
(3) with tones raised or lowered: Melody Plagiarism
Our focus is Melody Plagiarism (3), since Rhythm Plagiarism (2) can be ignored if we regularize the note speed, and Sampling Plagiarism (1) is a direct combination of Rhythm Plagiarism (2) and Melody Plagiarism (3).
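One way to read "regularize the note speed" is to discard timing and keep only the order of the pitches. The sketch below assumes a hypothetical input format of (onset_time, note_number) pairs; the slides do not say how the notes are extracted from the MIDI files.

```python
def pitch_sequence(events):
    """Return the note numbers ordered by onset time, dropping all timing.

    `events` is an iterable of (onset_time, note_number) pairs. This input
    format is an assumption made for illustration only.
    """
    return [note for _, note in sorted(events, key=lambda event: event[0])]


# The same melody played at two speeds yields the same pitch sequence.
slow = [(0.0, 60), (1.0, 62), (2.0, 64)]
fast = [(0.0, 60), (0.5, 62), (1.0, 64)]
same_melody = pitch_sequence(slow) == pitch_sequence(fast)
```

Under this reading, two passages that differ only in speed become identical sequences, so only melodic differences are left to compare.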

4 Algorithm: BLAST
We mainly adapt an algorithm called the Basic Local Alignment Search Tool (BLAST).
BLAST was originally designed for rapid comparison of gene and protein sequences.
The algorithm assumes that some segments of the query sequence exactly match index entries in the database.
It treats these hits as starting points and extends each one in both directions until the match score drops below a cut-off limit.
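The seed-and-extend idea described on this slide can be sketched on plain note sequences. This is a minimal illustration, not the authors' implementation: the scoring values, the `seed_len` and `drop_off` parameters, and the rule of stopping once the running score falls more than `drop_off` below the best score seen are all assumptions, and real BLAST includes further steps that are omitted here.

```python
def score(a, b):
    """Hypothetical scoring: +1 for equal notes, -1 otherwise."""
    return 1 if a == b else -1


def extend_one_way(query, target, q, t, step, drop_off):
    """Walk from (q, t) in direction `step`; return (best gain, notes kept)."""
    best = run = kept = walked = 0
    while 0 <= q < len(query) and 0 <= t < len(target):
        run += score(query[q], target[t])
        walked += 1
        if run > best:
            best, kept = run, walked
        elif best - run > drop_off:
            break  # score fell too far below the best seen: stop extending
        q += step
        t += step
    return best, kept


def seed_and_extend(query, target, seed_len=3, drop_off=2):
    """Return the best (score, query_start, target_start, length), or None."""
    # Index every window of the target by its exact content.
    seeds = {}
    for t in range(len(target) - seed_len + 1):
        seeds.setdefault(tuple(target[t:t + seed_len]), []).append(t)

    best = None
    for q in range(len(query) - seed_len + 1):
        # Each exact window match is a seed hit to extend on both sides.
        for t in seeds.get(tuple(query[q:q + seed_len]), []):
            right, r_len = extend_one_way(
                query, target, q + seed_len, t + seed_len, 1, drop_off)
            left, l_len = extend_one_way(
                query, target, q - 1, t - 1, -1, drop_off)
            hit = (seed_len + left + right, q - l_len, t - l_len,
                   seed_len + l_len + r_len)
            if best is None or hit[0] > best[0]:
                best = hit
    return best


result = seed_and_extend([60, 62, 64, 65, 67, 50],
                         [10, 60, 62, 64, 65, 67, 20, 30])
```

Because extension tolerates a bounded score drop, a single differing note inside an otherwise shared passage does not necessarily end the alignment.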

5 Source: http://bioinformatica.upf.edu/P13_2011/

6 Design
The database should be incremental: new entries can be added without rebuilding it.
The index and the music should be stored efficiently and be quick to retrieve.
Index structure: file name (the index key, i.e. the ‘window’), file content (a list of (MIDI_ID, position) tuples).
The ‘window’ is generated by converting the notes to a binary representation and converting that into an integer.
MIDI file structure: file name (MIDI_ID), file content (a map, {notes: note sequence, title: song_name, and other metadata…}).
The index and music databases are randomly distributed among N index servers and M music servers.
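A minimal in-memory sketch of the index structure described above, assuming each note is packed into 7 bits (enough for MIDI note numbers 0 to 127) and assuming a window of 4 notes. The slides give neither the window length nor the exact bit layout, and the real system stores each key as a file rather than a dictionary entry, so `WINDOW_SIZE`, `BITS_PER_NOTE`, `window_key`, and `add_to_index` are illustrative names only.

```python
from collections import defaultdict

WINDOW_SIZE = 4    # hypothetical window length, in notes
BITS_PER_NOTE = 7  # assumption: 7 bits cover MIDI note numbers 0..127


def window_key(notes):
    """Pack a window of MIDI note numbers into a single integer key."""
    key = 0
    for note in notes:
        if not 0 <= note <= 127:
            raise ValueError(f"not a MIDI note number: {note}")
        key = (key << BITS_PER_NOTE) | note
    return key


def add_to_index(index, midi_id, notes, window_size=WINDOW_SIZE):
    """Add one song in place; existing entries are never rebuilt."""
    for pos in range(len(notes) - window_size + 1):
        key = window_key(notes[pos:pos + window_size])
        index[key].append((midi_id, pos))


index = defaultdict(list)
add_to_index(index, "song_a", [60, 62, 64, 65, 67])
add_to_index(index, "song_b", [62, 64, 65, 67, 69])

# Both songs contain the window 62, 64, 65, 67, at different positions.
shared_hits = index[window_key([62, 64, 65, 67])]
```

Adding a song only appends to the lists under its own window keys, which is what lets the index grow incrementally.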

7 Standard Note Encoding
MIDI note number range: 0~127
MIDI octave size: 12
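The slide gives only the two constants. Judging from the scoring slide, a "standard note" appears to be the note number reduced modulo the octave size; the sketch below rests on that assumption, and `standard_note` is a hypothetical name.

```python
OCTAVE_SIZE = 12   # from the slide
MAX_MIDI_NOTE = 127  # from the slide: note numbers run from 0 to 127


def standard_note(note):
    """Map a MIDI note number to its position within the octave (0..11).

    Assumption: "standard note" means the note number modulo 12.
    """
    if not 0 <= note <= MAX_MIDI_NOTE:
        raise ValueError(f"not a MIDI note number: {note}")
    return note % OCTAVE_SIZE


# Notes 60 and 72 are one octave apart and share a standard note.
same_standard_note = standard_note(60) == standard_note(72)
```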

8 Query Score Calculation
Perfect matching: the query note number is exactly the same as the MIDI note number. The highest weight is given for this category.
"Standard note" matching: the query note number differs from the MIDI note number, but their difference is divisible by the octave size (12). A moderate weight is given for this category.
Mismatching: any case other than perfect matching or "standard note" matching. A penalty is given for this category.
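The three categories can be sketched as a per-note scoring function. The numeric weights below are placeholders chosen for illustration; the slides state only their relative order (highest, moderate, penalty).

```python
OCTAVE_SIZE = 12

# Hypothetical weights: only their ordering comes from the slides.
PERFECT_SCORE = 2
STANDARD_SCORE = 1
MISMATCH_PENALTY = -1


def note_score(query_note, midi_note):
    """Score one query note against one MIDI note."""
    if query_note == midi_note:
        return PERFECT_SCORE
    if (query_note - midi_note) % OCTAVE_SIZE == 0:
        return STANDARD_SCORE  # same note, different octave
    return MISMATCH_PENALTY


def alignment_score(query, target):
    """Sum the per-note scores over two aligned note sequences."""
    return sum(note_score(q, t) for q, t in zip(query, target))


# One perfect match, one octave-shifted match, one mismatch.
total = alignment_score([60, 62, 64], [60, 74, 65])
```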

9 Demonstration Just do it!

