Digital libraries and web-based information systems. Mohsen Kamyar.

1 Digital libraries and web-based information systems. Mohsen Kamyar

2 Outline: what is a digital library, brief history, enabling the Semantic Web, some projects.

3 What is a Digital Library? A collection of books; a collection of documents; the Web?

4 Brief History The goal is to make web content understandable to machines. Efforts to date can be categorized under the scope of digital libraries. A library keeps cards that contain brief information about each book; the aim is to make this information as broad as possible and understandable to machines.

5 Brief History (continued) According to Milne, “providing structures is one of the things that Description Logic do best!” The Web is a large example of a digital library. It contains documents about which no information is available, so we need some way to describe their underlying semantics.

6 SGML to Semantic Web One of the first efforts in the field of digital libraries was SGML (Standard Generalized Markup Language). This language uses tags to describe the meaning of each word (for example, a tag wrapped around the word "Washington"). Eventually, ideas from SGML and HTML were combined to produce XML, which describes semantics as metadata.
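
As a minimal sketch of tags describing the meaning of a word, the Python fragment below parses a small XML string with the standard library. The tag names `sentence` and `city` and the sample sentence are assumptions made for illustration; they are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names <sentence> and <city> are invented
# for this sketch and do not come from the original presentation.
DOC = "<sentence>The meeting is in <city>Washington</city>.</sentence>"


def tagged_words(xml_text):
    """Return (tag, text) pairs for every tagged word in the fragment."""
    root = ET.fromstring(xml_text)
    # Each direct child element is one tagged word; its tag is the metadata.
    return [(child.tag, child.text) for child in root]


print(tagged_words(DOC))
```

A program reading this fragment can tell that "Washington" is meant as a city rather than a person, which is the kind of machine-understandable description the slide refers to.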

7 SGML to Semantic Web To prevent the web community from repeating its mistakes in knowledge representation, the Description Logics (DL) community became very active and proposed many languages for describing semantics.

8 Untangle A system for representing card-catalog information. It has a web-based interface and can be considered the first description logic (DL) system with one. It used Lisp and a Lisp-based HTTP server to provide the web-based interface. The goal was to apply DL to digital libraries (classification and information retrieval).

9 Untangle Classification of information is now one of the main concerns on the web. The project yielded some practical lessons: objects that are naturally distinguishable should be defined as primitive concepts, and the others as defined concepts; for example, book and journal are two primitive concepts. The project also exposed some limitations of DL for modelling.

10 Untangle One limitation is the trade-off between defining objects as concepts or as individuals. For example, a person should be defined as an individual, but some persons are the subject of many books, so for efficiency they should be defined as concepts. For this reason, the project added some special concepts to DL.

11 FindUR The main goal of this project was query expansion. In query expansion we expand a query with the synonyms and hyponyms of its terms. There are two main criteria in the information retrieval field: precision, the ratio of desired returned pages to all returned pages (lowered by false positives); and recall, the ratio of desired returned pages to all desired pages (lowered by false negatives, that is, desired pages that were missed).
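
The two criteria can be computed as in the sketch below, which uses the standard information retrieval definitions of precision and recall. The page identifiers and the function name `precision_recall` are hypothetical and chosen only for this example.

```python
def precision_recall(returned, relevant):
    """Compute (precision, recall) for one query.

    returned: pages the search engine gave back
    relevant: pages the user actually wanted (the "desired" pages)
    """
    returned, relevant = set(returned), set(relevant)
    true_positives = returned & relevant  # desired pages that were returned
    # Precision falls when undesired pages are returned (false positives).
    precision = len(true_positives) / len(returned) if returned else 0.0
    # Recall falls when desired pages are missed (false negatives).
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four pages returned, three pages desired, two in common.
returned_pages = ["p1", "p2", "p3", "p4"]
desired_pages = ["p1", "p2", "p5"]
precision, recall = precision_recall(returned_pages, desired_pages)
print(precision, recall)
```

Here two of the four returned pages are desired, and two of the three desired pages are found, so one desired page is a false negative.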

12 FindUR If we use synonyms and hyponyms in a search, we can reduce the false-negative ratio. In this project, wordnets were used to specify the synonyms and hyponyms. The project was later extended and found many more complex applications, for example handling complex search phrases in news.
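
The effect of query expansion on false negatives can be sketched as follows. The small dictionaries stand in for a wordnet, and the documents, terms, and function names are all assumptions made for illustration; this is not FindUR's actual implementation.

```python
# Hypothetical toy thesaurus standing in for a wordnet.
SYNONYMS = {"car": ["automobile"]}
HYPONYMS = {"car": ["taxi", "convertible"]}


def expand_query(terms):
    """Return the query terms plus their synonyms and hyponyms, without duplicates."""
    expanded = []
    for term in terms:
        candidates = [term, *SYNONYMS.get(term, []), *HYPONYMS.get(term, [])]
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def search(documents, terms):
    """Return ids of documents that contain at least one of the terms."""
    wanted = set(terms)
    return [doc_id for doc_id, text in documents.items()
            if wanted & set(text.lower().split())]


# Hypothetical document collection.
DOCS = {
    "d1": "used car prices",
    "d2": "taxi fares in the city",
    "d3": "bicycle repair",
}

plain = search(DOCS, ["car"])                   # misses the taxi document
expanded = search(DOCS, expand_query(["car"]))  # also finds it via a hyponym
print(plain, expanded)
```

The expanded query retrieves the document about taxis that the plain query misses, which is how expansion lowers the false-negative ratio. It can also return more undesired pages, so precision may drop.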

13 Alvis: Superpeer Semantic Search Engine. Partners: Helsinki University of Technology, EPFL (Switzerland), ... This project has three stages: Crawling the web with a subject-specific crawler and converting the documents to a simple XML format.

14 Alvis Analyzing the documents with a domain-specific linguistic analyzer and annotating them. In this stage semantic tagging can be included, the generated document is indexed according to its domain and given a score, and some categories are assigned to each document. Finally, probabilistic approaches are used for searching. http://www.alvis.info

