Book: Bayesian Networks : A practical guide to applications Paper-authors: Luis M. de Campos, Juan M. Fernandez-Luna, Juan F. Huete, Carlos Martine, Alfonso.

Presentation transcript:

Book: Bayesian Networks: A Practical Guide to Applications
Paper authors: Luis M. de Campos, Juan M. Fernandez-Luna, Juan F. Huete, Carlos Martine, Alfonso E. Romero
Chapter: 12
Presented by: Quratulain
CSE 655 Probabilistic Reasoning, Faculty of Computer Science, Institute of Business Administration

Outline
Introduction
Overview of information retrieval systems
Bayesian networks and information retrieval
Theoretical foundations
Building the information retrieval system
Conclusion
10 Oct 2009, Quratulain

Introduction/Motivation
To meet the democratic goal of transparency, all activities of the parliament need to be made public. Previously, this information was sent in printed form to official organizations and libraries. Today the documents are published electronically on the web, which is faster, cheaper and easier. The official bulletin and the transcripts of all speeches in the different sessions are edited and then published on the website as PDF files. The documents are accessible through database-like queries.

Problems
To access information, the user must already know details such as the session number and the date of the legislature. This makes the information difficult to reach.

Goal
A website with a real, content-based search engine. The user states an information need as a natural language query, and the system returns the relevant documents. The output is a set of document components of varying granularity, from a complete document down to a single paragraph, sorted by degree of relevance. This avoids manual search.

Overview of information retrieval
Information retrieval is concerned with the representation, storage, organization of, and access to information items. Given a set of documents, an information retrieval system works as follows:
1. Pre-processing removes words that are not useful for search (stopwords).
2. Each remaining word is converted to its stem, which reduces the vocabulary.
3. Each term is associated with a weight expressing its importance in a document or in the whole collection.
4. The natural language query is indexed in the same way, and its representation is matched against the stored documents using an IR model.
5. Finally, a set of document identifiers is presented to the user, sorted by degree of relevance.
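
The steps of this slide can be sketched in a few lines of Python. This is a minimal illustration, not the system described in the talk: the stopword list, the crude suffix stripper standing in for a real stemmer, the tf-idf weighting and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "of", "a", "in", "on", "and", "to", "is"}

def stem(word):
    """Crude suffix stripping standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, drop stopwords, and stem the remaining words."""
    words = [w.strip(".,;:!?") for w in text.lower().split()]
    return [stem(w) for w in words if w and w not in STOPWORDS]

def build_index(docs):
    """Return per-document tf-idf weights: {doc_id: {term: weight}}."""
    term_counts = {d: Counter(preprocess(text)) for d, text in docs.items()}
    doc_freq = Counter(t for counts in term_counts.values() for t in counts)
    n_docs = len(docs)
    return {
        d: {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in counts.items()}
        for d, counts in term_counts.items()
    }

def search(query, index):
    """Rank document identifiers by the summed weight of matching query terms."""
    q_terms = set(preprocess(query))
    scores = {
        d: sum(w for t, w in weights.items() if t in q_terms)
        for d, weights in index.items()
    }
    return sorted((d for d in scores if scores[d] > 0), key=lambda d: -scores[d])

# Toy collection (invented for the example).
docs = {
    "D1": "Budget debate in the parliament",
    "D2": "Speeches on education and schools",
    "D3": "The education budget and the budget of schools",
}
index = build_index(docs)
ranking = search("budget", index)
print(ranking)
```

Here the query is indexed with the same `preprocess` function as the documents, which is what lets the two representations be matched term by term.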

Overview of information retrieval
Standard IR treats documents as atomic entities. XML allows documents to be structured and to carry semantics. Structured IR views a document as an aggregate of interrelated structural elements that are indexed. Structured IR models exploit both the content and the structure of documents to estimate the relevance of document components to a query.
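
To make the idea of document components concrete, the sketch below walks an XML document and lists every element as a candidate retrieval unit, from the whole document down to single paragraphs. The markup and element names are hypothetical and are not taken from the parliamentary collection.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML markup; element names are illustrative only.
XML_DOC = """
<document>
  <title>Session record</title>
  <section>
    <title>Budget</title>
    <paragraph>The budget was approved.</paragraph>
    <paragraph>Two amendments were rejected.</paragraph>
  </section>
</document>
"""

def structural_units(element, path=""):
    """Yield (path, text) for every element, from the whole document down to leaves."""
    here = f"{path}/{element.tag}"
    # Collect all text below this element and normalise whitespace.
    text = " ".join(" ".join(element.itertext()).split())
    yield here, text
    for child in element:
        yield from structural_units(child, here)

root = ET.fromstring(XML_DOC)
units = list(structural_units(root))
for path, text in units:
    print(path, "->", text)
```

A structured IR model would score each of these units against the query, so a single paragraph can be returned instead of the full document.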

Bayesian networks and information retrieval
Bayesian networks were first applied to IR in the early 1990s by Turtle and Croft. BN-based IR models compute the probability of relevance given a document and a query. Two important BN models within IR are the belief network model and the Bayesian network retrieval model. They share two features: each index term and each document is represented as a node in the network, and links connect each document node with the nodes of its index terms. The models differ in the direction of the arcs and in the additional arcs they allow (relationships among documents or among terms).

BN-based retrieval model
[Figure: network linking term nodes T1 to T7 with document nodes D1 to D3.]

Drawbacks of Bayesian networks
1. The time and space required to assess the distributions and store them: the number of conditional probabilities per node grows exponentially with the number of parent nodes.
2. The efficiency of inference: general inference in BNs is an NP-hard problem.
Therefore, the direct approach, in which the evidence contained in a query is propagated through the whole network, is infeasible.
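
The first drawback is easy to quantify. A binary node with k binary parents needs one conditional probability for each of the 2^k parent configurations. The short calculation below shows how quickly that grows; the parent counts are arbitrary values chosen for illustration.

```python
def cpt_entries(num_parents):
    """Number of parent configurations for a binary node with binary parents."""
    return 2 ** num_parents

# Arbitrary parent counts, chosen only to show the growth.
for k in (5, 10, 20, 30):
    print(f"{k} parents -> {cpt_entries(k)} conditional probabilities")
```

A structural unit indexed by only a few dozen terms is already far beyond what can be stored or estimated as an explicit table.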

Theoretical foundations
The model starts from a set of documents $D = \{D_1, D_2, \ldots, D_M\}$ and the set of terms used to index them. Each document $D_i$ is organized hierarchically: its elements, called structural units, are linked by structural associations, and these associations form a tree. A scientific article is an example.

The structure of a scientific article
[Figure: tree for Document 1 with the nodes Title, Author, Abstract, Section 1, Section 2 and Bibliography; lower levels show Subsec 1, Subsec 2, Title, Parag 1, Parag 2, Ref 1 and Ref 2, with the index terms at the bottom.]

BN model for a document
The BN model of a document contains three kinds of nodes:
Terms: $T = \{T_1, T_2, \ldots, T_l\}$
Basic structural units: $U_b = \{B_1, B_2, \ldots, B_m\}$
Complex structural units: $U_c = \{S_1, S_2, \ldots, S_m\}$
The set of all structural units is $U = U_b \cup U_c$. Each node $T$, $B$ or $S$ has an associated binary random variable taking values in $\{t^-, t^+\}$, $\{b^-, b^+\}$ or $\{s^-, s^+\}$ respectively, where $-$ means not relevant and $+$ means relevant.

BN model for a document
[Figure: layered network with term nodes T1 to T16, basic structural units B1 to B7 ($U_b$) and complex structural units S1 to S4 ($U_c$). The slide also states a condition on $Pa(S_1)$ and $Pa(S_2)$ for $S_1, S_2 \in U_c$.]

BN for a document
Conditional probabilities to assess: $P(t^+)$, $P(b^+ \mid pa(B))$ and $P(s^+ \mid pa(S))$. Because nodes can have a large number of parents, an efficient inference procedure is needed.
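
The slide lists the distributions but not their form. One way to avoid exponential tables, assumed here purely for illustration, is an additive canonical model in which $P(u^+ \mid pa(U))$ is the sum of non-negative weights of the parents that are relevant, with the weights of each node summing to at most 1. By linearity of expectation, the posterior of each unit is then a single weighted sum of its parents' posteriors, so one bottom-up pass is enough. The node names, weights and prior in the sketch are hypothetical.

```python
# Hypothetical weights w(parent, child); each node's incoming weights sum to at most 1.
WEIGHTS = {
    "B1": {"T1": 0.6, "T2": 0.4},
    "B2": {"T2": 0.5, "T3": 0.5},
    "S1": {"B1": 0.7, "B2": 0.3},
}

def propagate(term_probs, weights, order):
    """Compute P(u+ | query) bottom-up as a weighted sum of parent posteriors."""
    probs = dict(term_probs)
    for node in order:  # parents must appear before their children
        probs[node] = sum(w * probs[parent] for parent, w in weights[node].items())
    return probs

# The query mentions T1 only, so P(t1+ | Q) = 1.
# The other terms keep a hypothetical prior of 0.1.
posterior = propagate(
    {"T1": 1.0, "T2": 0.1, "T3": 0.1},
    WEIGHTS,
    ["B1", "B2", "S1"],
)
for node in ("B1", "B2", "S1"):
    print(node, round(posterior[node], 3))
```

Under this assumption each node stores one weight per parent rather than one probability per parent configuration, and inference costs one multiplication per arc.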

Influence diagram model
Once the BN has been constructed, it is transformed into an influence diagram by adding decision and utility nodes.
Chance nodes: the nodes of the previous BN
Decision nodes:
Utility nodes:

Building the information retrieval system (PAIRS)
PAIRS is a software package written in C++ that stores documents in a relational database. It was developed specifically to store and retrieve the documents generated by the Parliament of Andalusia, and it is based on a probabilistic model.
[Figure: General scheme of PAIRS. Components shown: PDF document collection, XML document collection, Indexing System, Indexed Document Collection, Query, Indexed Query, Search Engine, Retrieved Document Components.]

Conclusion
The paper presents a retrieval system, based on a probabilistic model, for parliamentary information. The system has proven efficient in terms of indexing and retrieval time. Bayesian network technologies can be employed in problem domains whose dimensionality would previously have ruled out their use. The system is not a finished product; several improvements are still possible.