Optical Music Recognition Ichiro Fujinaga McGill University 2003.


1 Optical Music Recognition Ichiro Fujinaga McGill University 2003

2 Content
• Optical Music Recognition
• Levy Project
• Levy Sheet Music Collection
• Digital Workflow Management
• Gamera
• Guido / NoteAbility

3 Optical Music Recognition (OMR)
• Trainable open-source OMR system in development since 1984
• Staff recognition and removal
  - Run-length coding
  - Projections
• Lyric removal / classifier
• Stems and notehead removal
• Music symbol classifier
• Score reconstruction
Demo
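The staff recognition and removal step can be illustrated with a minimal sketch. It is not the code of the OMR system described on this slide: the function names, the toy image, and the 80% fill threshold are all assumptions made for illustration. It uses a row projection to locate staff lines, vertical run lengths to estimate line thickness, and a naive rule that erases a staff pixel only when nothing touches it from above or below (which only suits one-pixel-thick lines).

```python
from collections import Counter


def row_projection(image):
    """Count black pixels (1s) in each row of a binary image."""
    return [sum(row) for row in image]


def vertical_black_runs(image):
    """Lengths of all vertical runs of black pixels, column by column."""
    runs = []
    for column in zip(*image):
        length = 0
        for pixel in column:
            if pixel:
                length += 1
            elif length:
                runs.append(length)
                length = 0
        if length:
            runs.append(length)
    return runs


def estimate_line_thickness(image):
    """Most common vertical black run length, taken as staff line thickness."""
    return Counter(vertical_black_runs(image)).most_common(1)[0][0]


def find_staff_rows(image, min_fill=0.8):
    """Rows whose black-pixel count is at least min_fill of the image width."""
    width = len(image[0])
    return [y for y, count in enumerate(row_projection(image))
            if count >= min_fill * width]


def remove_thin_lines(image, staff_rows):
    """Erase staff pixels that have no black neighbour directly above or below."""
    height = len(image)
    cleaned = [row[:] for row in image]
    for y in staff_rows:
        for x, pixel in enumerate(image[y]):
            above = image[y - 1][x] if y > 0 else 0
            below = image[y + 1][x] if y + 1 < height else 0
            if pixel and not above and not below:
                cleaned[y][x] = 0
    return cleaned


def make_toy_image():
    """Hypothetical 11x12 image: five staff lines plus one symbol pixel."""
    width, height = 12, 11
    image = [[0] * width for _ in range(height)]
    for y in (1, 3, 5, 7, 9):
        image[y] = [1] * width
    image[4][5] = 1  # a symbol pixel bridging the second and third lines
    return image
```

Pixels where a symbol crosses a staff line are kept, so the symbol stays connected after the lines are erased.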

4 OMR: Classifier
• Connected-component analysis
• Feature extraction, e.g.:
  - Width, height, aspect ratio
  - Number of holes
  - Central moments
• k-nearest neighbor classifier
• Genetic algorithm
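A minimal sketch of the feature extraction and k-nearest neighbor steps listed above, assuming each connected component is already available as a small binary matrix. The function names, the feature set (hole counting is omitted), and the toy glyphs are assumptions for illustration, not the system's own code.

```python
import math
from collections import Counter


def extract_features(glyph):
    """Width, height, aspect ratio and second-order central moments of a glyph."""
    pixels = [(x, y) for y, row in enumerate(glyph)
              for x, value in enumerate(row) if value]
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    n = len(pixels)
    cx, cy = sum(xs) / n, sum(ys) / n
    mu20 = sum((x - cx) ** 2 for x in xs) / n  # horizontal spread
    mu02 = sum((y - cy) ** 2 for y in ys) / n  # vertical spread
    return [width, height, width / height, mu20, mu02]


def knn_classify(sample, training, k=3, weights=None):
    """Majority vote among the k training vectors closest to sample.

    weights scales each feature's contribution to the squared distance;
    by default every feature counts equally.
    """
    weights = weights or [1.0] * len(sample)

    def distance(a, b):
        return math.sqrt(sum(w * (p - q) ** 2
                             for w, p, q in zip(weights, a, b)))

    nearest = sorted(training, key=lambda item: distance(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def block(width, height):
    """Hypothetical solid rectangular glyph used as toy training data."""
    return [[1] * width for _ in range(height)]


# Toy training set: thin bars stand in for stems, wider blocks for noteheads.
TRAINING = [
    (extract_features(block(1, 5)), "stem"),
    (extract_features(block(1, 3)), "stem"),
    (extract_features(block(3, 3)), "notehead"),
    (extract_features(block(4, 3)), "notehead"),
]
```

Calling `knn_classify(extract_features(block(1, 4)), TRAINING)` classifies an unseen thin bar against the toy training set.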

5 Overall Architecture for OMR
[Diagram] Image File; Staff removal; Segmentation; Recognition (K-NN Classifier); Output (Symbol Name); Knowledge Base (Feature Vectors); Off-line Optimization (Genetic Algorithm, K-NN Classifier); Best Weight Vector
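The off-line optimization box in the architecture can be sketched as a small genetic algorithm that searches for a feature weight vector maximizing the accuracy of a nearest-neighbor classifier. Everything here is an assumption made for illustration: the leave-one-out fitness, the 1-NN classifier, the crossover and mutation operators, the population size, and the toy data are not taken from the slides.

```python
import random


def loo_accuracy(weights, data):
    """Leave-one-out accuracy of a 1-NN classifier with weighted distance."""
    correct = 0
    for i, (sample, label) in enumerate(data):
        best_label, best_dist = None, float("inf")
        for j, (other, other_label) in enumerate(data):
            if i == j:
                continue
            dist = sum(w * (a - b) ** 2
                       for w, a, b in zip(weights, sample, other))
            if dist < best_dist:
                best_dist, best_label = dist, other_label
        correct += best_label == label
    return correct / len(data)


def evolve_weights(data, n_features, generations=20, pop_size=12, seed=0):
    """Search for a weight vector in [0, 1]^n with a simple elitist GA."""
    rng = random.Random(seed)
    # Start from uniform weights plus random candidates.
    population = [[1.0] * n_features] + [
        [rng.random() for _ in range(n_features)]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda w: loo_accuracy(w, data), reverse=True)
        parents = ranked[:pop_size // 2]
        children = [ranked[0]]  # elitism: the best candidate always survives
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one parent at random.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutate one gene with Gaussian noise, clamped to [0, 1].
            idx = rng.randrange(n_features)
            child[idx] = min(1.0, max(0.0, child[idx] + rng.gauss(0, 0.2)))
            children.append(child)
        population = children
    return max(population, key=lambda w: loo_accuracy(w, data))


def make_toy_data(seed=1, per_class=10):
    """Hypothetical data: feature 0 separates the classes, feature 1 is noise."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("notehead", 0.0), ("rest", 1.0)):
        for _ in range(per_class):
            data.append(([centre + rng.gauss(0, 0.1), rng.uniform(-5, 5)],
                         label))
    return data
```

Because the uniform weight vector is in the initial population and the best candidate is always carried over, the returned weights never score worse than unweighted classification on the training data.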

6 Lester S. Levy Collection

7
• North American sheet music (1780–1960)
• Digitized 29,000 pieces
  - including “The Star-Spangled Banner” and “Yankee Doodle”
• Database of:
  - text index records
  - images of music (8-bit gray)
  - lyrics (first lines of verse and chorus)
  - color images of cover sheets (32-bit)
http://levysheetmusic.mse.jhu.edu

8 Digital Workflow Management
• Reduce the manual intervention for large-scale digitization projects
• Creation of data repository (text, image, sound)
• Optical Music Recognition (OMR)
• Gamera
• XML-based metadata
  - composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
  - cross-references for various forms of names, pseudonyms
  - authoritative versions of names and subject terms
• Music and lyric search engines
• Analysis toolkit

9 The problem
• Suitable OCR for lyrics not found
• Commercial OCR systems are often inadequate for non-standard documents
• The market for specialized recognition of historical documents is very small
• Researchers performing document recognition often “re-invent” the basic image processing wheel

10 The solution
• Provide easy-to-use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
• Generalize OMR for structured documents

11 Introducing Gamera
Generalized Algorithms and Methods for Enhancement and Restoration of Archives
• Framework for creating structured document recognition systems
• Designed for domain experts
• Image processing tools (filters, binarizations, etc.)
• Document segmentation and analysis
• Symbol segmentation and classification
  - Feature extraction and selection
  - Classifier selection and combiners
• Syntactical and semantic analysis

12 Features of Gamera
• Portability (Unix, Windows, Mac)
• Extensibility (Python and C++ plugins)
• Easy to use (experts and programmers)
• Open source
• Graphical user interface
• Interactive / batchable (scripts)

13 Architecture of Gamera
[Diagram] Graphical User Interface (wxWindows); Scripting Environment (Python); GAMERA Core (C++); Plugins (Python); Plugins (C++); Automatic Plugin Wrapper (Boost)

14 Example of C++ Plugin
    // Number of pixels in matrix
    #include "gamera.hh"

    #ifdef __area_wrap__
    #define NARGS 1
    #define ARG1_ONEBIT
    #endif

    using namespace Gamera;

    // Returns rows x columns, i.e. the total pixel count of the matrix
    template<class T>
    feature_t area(T &m) {
      return feature_t(m.nrows() * m.ncols());
    }

15 Example of Python Plugin
    # This filters a list of CC objects
    import gamera

    def filter_wide(ccs, max_width):
        tmp = []
        for x in ccs:
            if x.ncols() > max_width:
                # Too wide: blank out the component
                x.fill_matrix(0)
            else:
                # Narrow enough: keep it
                tmp.append(x)
        return tmp

16 Gamera: Interface (screenshot in Linux)

17

18 Histogram (screenshot in Linux)

19 Thresholding (screenshot in Linux)

20

21 Staff removal: Lute tablature

22

23 Classifier: Lute (screenshot in Linux)

24 Staff removal: Neums

25 Classifier: Neums (screenshot in Linux)

26 Greek example

27 GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian
• “A formal language for score-level representation”
• Plain text: readable, platform independent
• Extensible and flexible
• Adequate representation
• NoteServer: Web/Windows
• GUIDO/XML
• NoteAbility (K. Hamel)

28 GUIDO: An example { [ \beamsOff | \clef \key f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8. g*1/16 | c#2*1/4. b1*1/8 a*1/4. g*1/8 | | e#*1/2 f#*1/4 f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8 g | c#2*1/4. b1*1/8 a*1/4. c#*1/8 ], …
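Because GUIDO is plain text, note durations can be read with very little code. The following is a minimal sketch that sums the durations in each bar of a fragment like the one above. It covers only a simplified subset of the notation: the regular expression, the function name, and the rule that a note without an explicit duration reuses the previous one are assumptions for illustration, and tags such as `\clef` are simply skipped.

```python
import re
from fractions import Fraction

# Simplified note pattern (an assumption, not the full GUIDO grammar):
# name or rest "_", optional accidentals, optional octave,
# optional "*num/den" duration, optional dots.
NOTE = re.compile(
    r"^(?P<name>[a-h_])(?P<acc>[#&]*)(?P<octave>-?\d+)?"
    r"(?:\*(?P<num>\d+)/(?P<den>\d+))?(?P<dots>\.*)$"
)


def bar_durations(fragment):
    """Return the total duration of each bar as a list of Fractions."""
    bars = []
    current = Fraction(0)
    last = Fraction(1, 4)  # assumed default if the first note has no duration
    for token in fragment.split():
        if token == "|":
            bars.append(current)
            current = Fraction(0)
            continue
        match = NOTE.match(token)
        if match is None:
            continue  # tags, brackets and anything else are ignored
        if match["num"]:
            base = Fraction(int(match["num"]), int(match["den"]))
            dots = len(match["dots"])
            # Each dot adds half of the previously added value.
            last = base * (2 - Fraction(1, 2 ** dots))
        current += last
    bars.append(current)
    return bars


# First bars of the slide's example, without the tags.
FRAGMENT = ("f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | "
            "e1*1/2 _*1/4 f#*1/8. g*1/16")
```

For `FRAGMENT`, `bar_durations` returns a quarter-note pickup followed by two bars that each add up to a whole note.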

29

30 Conclusions
• Gamera allows rapid development of domain-specific document recognition applications
• Domain experts can customize and control all aspects of the recognition process
• Includes an easy-to-use interactive environment for experimentation
• Beta version available on Linux
• OS X version in preparation

31 Projections
• X-projections
• Y-projections

