Unlock the books with IntelligentCAPTURE Xavier Baumgartner University of St. Gallen.

1 Unlock the books with IntelligentCAPTURE Xavier Baumgartner University of St. Gallen

2 Outline 1 Background of the Project: –Euregio Bodensee - Library Cooperation –Project AGI and VLB = Vorarlberger Landesbibliothek –IBH = Internationale Bodenseehochschule 2 Project Partners: –AGI: http://www.agi-imc.de/ –Libraries

3 Outline 3 Project Tools: –intelligentCAPTURE –IC CAI-Engine –intelligentSEARCH 4 Project Results: –Library catalogue: http://www.vorarlberg.at/vlb/ –Portal: http://www.dandelon.com

4 1 Background Euregio Bodensee - Region extending for roughly 50 km around Lake Constance (Bodensee) - Covers the southern German districts of Konstanz, Sigmaringen, Ravensburg, Lindau, Oberallgäu and Bodenseekreis - Austrian province of Vorarlberg - Swiss cantons of St. Gallen, Schaffhausen, Appenzell-Innerrhoden and Appenzell-Ausserrhoden - Principality of Liechtenstein.

5 1 Background Euregio Bodensee - Library Cooperation http://www.ub.uni-konstanz.de/euregio/bodkat.htm http://www.ub.uni-konstanz.de/boddb/

6 1 Background IBH = Internationale Bodensee-Hochschule (International Lake Constance University) - Virtual university - Network of 24 independent universities - Aim: promote cooperation among member universities in the fields of science, research and infrastructure - Use synergies to mutual advantage

7 2 Project Partners AGI - Information Management Consultants - Focused on information and knowledge management - Consulting - Software development and long-term maintenance - Uses advanced recognition technologies in: automatic indexing and text mining (CAI); machine translation (MT); optical character recognition (OCR); recognition of text structures in PDF documents; voice recognition

8 2 Project Partners AGI - Information Management Consultants Products: - based on the IBM technical platform Lotus Notes & Domino - intelligentCAPTURE -> tool for document capturing and machine indexing - IC INDEX -> tool for developing topic maps, taxonomies, thesauri and classifications - intelligentSEARCH -> tool for information retrieval and visualization

9 2 Project Partners Libraries - University of Applied Sciences Dornbirn - University of Applied Sciences Kempten - University of Applied Sciences Liechtenstein - Central Library Zurich for University Zurich - University of Applied Sciences Konstanz - University of St. Gallen

10 3 Project tools intelligentCAPTURE - intelligentCAPTURE software installed locally and connected to a scanner - Workflow: - Identification of the document via barcode - Scanning of the book's table of contents - Character recognition process (OCR) - Quick check of the OCR result

11 3 Project tools intelligentCAPTURE - Workflow (cont.): - Generation of PDF file - Compression of files - Automatic indexing (CAI engine) - Transfer of PDF file to file system - Export of indexing results and PDF files to: the local library system; the local intelligentSEARCH database; the central database, hosted by AGI
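
The workflow on slides 10 and 11 can be read as a fixed sequence of stages applied to one book. The following is a minimal sketch of that sequence only, not the intelligentCAPTURE implementation: every name in it (CaptureJob, process, fake_ocr, fake_indexer) is hypothetical, and the OCR and indexing steps are replaced by trivial stand-ins so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class CaptureJob:
    """State carried through the capture workflow for one book (hypothetical)."""
    barcode: str
    page_images: list = field(default_factory=list)
    ocr_text: str = ""
    pdf_name: str = ""
    index_terms: list = field(default_factory=list)
    exported_to: list = field(default_factory=list)


def fake_ocr(image):
    # Stand-in for a real OCR engine: the "image" is already a string.
    return image.upper()


def fake_indexer(text):
    # Stand-in for the CAI engine: keep distinct words longer than 3 letters.
    return sorted({word.lower() for word in text.split() if len(word) > 3})


def process(barcode, images, targets):
    """Run the stages in the order given on the slides."""
    job = CaptureJob(barcode=barcode)            # identification via barcode
    job.page_images = list(images)               # scanning the table of contents
    job.ocr_text = "\n".join(fake_ocr(img) for img in job.page_images)  # OCR
    job.pdf_name = f"{job.barcode}.pdf"          # PDF generation (name only here)
    job.index_terms = fake_indexer(job.ocr_text) # automatic indexing
    job.exported_to = list(targets)              # export to the three targets
    return job


TARGETS = [
    "local library system",
    "local intelligentSEARCH database",
    "central database",
]

job = process("B0001", ["chapter one libraries", "chapter two indexing"], TARGETS)
print(job.pdf_name, job.index_terms, job.exported_to)
```

The manual "quick check" of the OCR result and the file compression step are left out of the sketch because the slides do not say how they are performed.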

12 3 Project tools IC CAI-Engine - Automatic indexing is much more specific and comprehensive than indexing the title alone or intellectual indexing with a controlled vocabulary - Document analysis on the basis of linguistic methods and procedures from computational linguistics - All words are reduced to their linguistic base form (morphemes) - Uses large semantic nets (thesauri, topic maps etc.) - Statistical rules for relevance ranking
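
Two of the ideas on slide 12, reducing words to a base form and ranking terms statistically, can be illustrated with a toy example. This is only a sketch under strong assumptions: the suffix rules, the stop-word list and the use of raw frequency as the ranking statistic are all invented for illustration and say nothing about how the CAI engine actually works.

```python
import re
from collections import Counter

# Assumed stop-word list, for illustration only.
STOPWORDS = {"the", "of", "and", "in", "for", "a", "to"}


def base_form(word):
    """Very rough English base-form reduction (toy rules, not real morphology)."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def rank_terms(text, top_n=3):
    """Count base forms and return the most frequent ones."""
    words = re.findall(r"[A-Za-zÄÖÜäöüß]+", text)
    counts = Counter(
        base_form(word) for word in words if word.lower() not in STOPWORDS
    )
    return counts.most_common(top_n)


sample = "Libraries and library catalogues: indexing the catalogue of a library"
print(rank_terms(sample, top_n=2))
```

Because "Libraries" and "library" collapse to the same base form, they are counted together, which is the point of reducing words before ranking.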

13 3 Project tools IC CAI-Engine - Output of the most important terms in groups: - geographical terms - personal/corporate terms - branches/areas of activity - descriptors: words from the internal thesaurus - important words and phrases from the text - Libraries: use a broad generic thesaurus, approx. 300,000 German terms, and a smaller English thesaurus - Languages: German and English in use, French and Spanish available
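
The grouped output described on slide 13 can be pictured as sorting extracted terms into buckets by looking them up in controlled vocabularies. The sketch below assumes tiny hand-made vocabularies (TERM_GROUPS) and a hypothetical group_terms helper; the real thesaurus and the engine's grouping logic are not described in the slides.

```python
# Tiny invented vocabularies standing in for the real thesaurus.
TERM_GROUPS = {
    "geographical": {"bodensee", "vorarlberg", "st. gallen"},
    "personal/corporate": {"agi", "ibm"},
    "descriptors": {"library", "indexing"},
}

OTHER = "important words and phrases"


def group_terms(terms):
    """Assign each term to the first group whose vocabulary contains it."""
    grouped = {name: [] for name in TERM_GROUPS}
    grouped[OTHER] = []
    for term in terms:
        key = term.lower()
        for name, vocabulary in TERM_GROUPS.items():
            if key in vocabulary:
                grouped[name].append(term)
                break
        else:
            # No vocabulary matched: keep the term as a free-text word.
            grouped[OTHER].append(term)
    return grouped


print(group_terms(["Vorarlberg", "AGI", "indexing", "scanner"]))
```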

14 [Diagram: Library 1, Library 2 and Library 3, each with iCAPT and PDF; ILS and Indexing; AGI]

15 3 Project tools intelligentSEARCH - Search engine with a simple (Google-like) interface and IBM GTR (Global Text Retrieval) as core engine - Search terms input -> automatically expanded semantically - Main features of GTR: Operators: Boolean, adjacency, near, paragraph, sentence, right and left truncation, wildcard, fuzzy searching, sorting by relevance
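
Some of the operators listed on slide 15 (truncation, wildcard, fuzzy searching) can be demonstrated against a small word list with the Python standard library. This sketch does not reproduce IBM GTR's query syntax or matching behaviour; the vocabulary, the function names and the fuzzy cutoff are assumptions chosen for the example.

```python
import difflib
import fnmatch

# Invented vocabulary standing in for an index of searchable terms.
VOCAB = ["library", "libraries", "librarian", "catalogue", "catalog", "indexing"]


def right_truncation(stem, vocab):
    """Match terms that start with the stem (e.g. librar*)."""
    return [word for word in vocab if word.startswith(stem)]


def left_truncation(ending, vocab):
    """Match terms that end with the given ending (e.g. *ing)."""
    return [word for word in vocab if word.endswith(ending)]


def wildcard(pattern, vocab):
    """Shell-style wildcard match using * and ?."""
    return fnmatch.filter(vocab, pattern)


def fuzzy(term, vocab, cutoff=0.8):
    """Tolerate small spelling errors via similarity ratio (assumed cutoff)."""
    return difflib.get_close_matches(term, vocab, n=3, cutoff=cutoff)


print(right_truncation("librar", VOCAB))
print(left_truncation("ing", VOCAB))
print(wildcard("catalog*", VOCAB))
print(fuzzy("libary", VOCAB))
```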

16 3 Project tools intelligentSEARCH - AGI-developed features: - Highlighting - Interfaces to library system, bookseller, web via Google - Query expansion by semantic nets - Visualization and browsing of topic maps
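
Query expansion by a semantic net and hit highlighting, both named on slide 16, can be sketched together. The net below is a two-entry invented dictionary and the ** markers are an arbitrary choice; neither reflects the actual thesaurus or the display format of intelligentSEARCH.

```python
import re

# Invented semantic net: each term maps to related terms.
SEMANTIC_NET = {
    "library": ["libraries", "bibliothek"],
    "index": ["indexing", "catalogue"],
}


def expand_query(terms):
    """Add related terms from the net, keeping order and dropping duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + SEMANTIC_NET.get(term.lower(), []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def highlight(text, terms, mark="**"):
    """Wrap whole-word, case-insensitive matches of any term in markers."""
    if not terms:
        return text
    # Longest terms first so longer alternatives are tried before shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b"
    return re.sub(
        pattern,
        lambda match: f"{mark}{match.group(0)}{mark}",
        text,
        flags=re.IGNORECASE,
    )


query = expand_query(["library"])
print(query)
print(highlight("Libraries of the Bodensee region", query))
```

Note that the user typed only "library", yet the plural "Libraries" in the text is highlighted because the expanded query contains it.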

18 4 Project Results - Library OPAC Vorarlberger Landesbibliothek: http://vlb-katalog.vorarlberg.at - Portal: www.dandelon.com

19 4 Project results www.dandelon.com - Portal with semantic search engine (intelligentSEARCH) - Content: automatically indexed contents pages of books and other publications; PDF files of the contents pages - Search terms expanded semantically - Relevance ranking - Highlighting

20 4 Project results www.dandelon.com - Links to libraries holding the book, to booksellers, to internet search engines - View topic maps

