Databases and Information Retrieval, Lecture 1: Basics of Databases and Information Retrieval


1 Databases and Information Retrieval
Lecture 1: Basics of Databases and Information Retrieval
Instructor: Mr. Gautam Das, University of Texas at Arlington
Email: gdas@cse.uta.edu

2 Database
Consists of a schema (relational model).
Data is stored in the form of tables.
Follows the typical query model with joins.
Output is in the form of tuples, which are built by joining one or more tables.
IR
Data is a collection of documents (unstructured pieces of information).
Follows a rank-and-relevance query model.
Output is the document.
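
The contrast above can be sketched in a few lines of Python. This is only an illustration: the table names, columns, and sample documents are hypothetical and do not come from the lecture. The database side returns tuples built by a join; the IR side returns whole documents.

```python
import sqlite3

# Database side: schema, tables, and a join that returns tuples.
# Table and column names are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE car (car_id INTEGER PRIMARY KEY, model TEXT);
CREATE TABLE accident (accident_id INTEGER PRIMARY KEY, car_id INTEGER, city TEXT);
INSERT INTO car VALUES (1, 'Sedan'), (2, 'Truck');
INSERT INTO accident VALUES (10, 1, 'Arlington'), (11, 2, 'Dallas');
""")
rows = conn.execute(
    "SELECT car.model, accident.city "
    "FROM car JOIN accident ON car.car_id = accident.car_id "
    "ORDER BY car.car_id"
).fetchall()

# IR side: an unstructured collection; the answer is the document itself.
documents = ["Car accident in Arlington", "Truck sales rise"]
hits = [d for d in documents if "accident" in d.lower().split()]
```

Here `rows` holds joined tuples, while `hits` holds the matching documents unchanged.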

3 Types of Queries
Conjunctive queries, e.g. { Car, Accident }: search for documents that contain both the word “Car” and the word “Accident”.
General Boolean queries, e.g. { Car + Accident – Arlington }: search for documents that contain the words “Car” and “Accident” but not the word “Arlington”.
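
One common way to evaluate such queries is an inverted index with set operations. The sketch below assumes whitespace tokenization and uses made-up sample documents; `build_index` and `boolean_query` are illustrative names, not part of the lecture.

```python
from collections import defaultdict

# Hypothetical sample collection: document id -> text.
docs = {
    1: "car accident reported in arlington",
    2: "car accident on the highway near dallas",
    3: "new car dealership opens",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def boolean_query(index, required, excluded=()):
    """Return ids of documents with all required words and no excluded word."""
    if not required:
        return set()
    # Conjunction: intersect the posting sets of all required words.
    result = set.intersection(*(index.get(w, set()) for w in required))
    # Negation: drop documents that contain any excluded word.
    for w in excluded:
        result -= index.get(w, set())
    return result

index = build_index(docs)
both = boolean_query(index, ["car", "accident"])
not_arlington = boolean_query(index, ["car", "accident"], excluded=["arlington"])
```

The first query corresponds to { Car, Accident }, the second to { Car + Accident – Arlington }.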

4 Retrieval Models of IR
Boolean retrieval model
Ranked / relevance retrieval model (the one that is missing in databases)

5 Parameters Used for Ranking in a Typical Information Retrieval System
Parameter 1: Occurrence and Frequency
The number of times the specified word occurs in the document helps decide the rank.
The position at which it occurs also matters, e.g. the title or subtitle.
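
A minimal sketch of this parameter, assuming each document is a dictionary of fields and that occurrences in the title count more than those in the body. The weights in `FIELD_WEIGHTS` are arbitrary values chosen for illustration, not values from the lecture.

```python
# Assumed weights: a hit in the title counts three times a hit in the body.
FIELD_WEIGHTS = {"title": 3.0, "subtitle": 2.0, "body": 1.0}

def occurrence_score(doc, term):
    """Sum the term's occurrences per field, scaled by the field's weight."""
    term = term.lower()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = doc.get(field, "").lower().split()
        score += weight * words.count(term)
    return score

doc_a = {"title": "Car accident", "body": "A car hit a tree"}
doc_b = {"title": "Traffic news", "body": "car car car"}
score_a = occurrence_score(doc_a, "car")  # one title hit plus one body hit
score_b = occurrence_score(doc_b, "car")  # three body hits only
```

With these weights `doc_a` outranks `doc_b` even though `doc_b` repeats the word more often, because one of its occurrences is in the title.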

6 Parameters Used for Ranking in a Typical Information Retrieval System
Parameter 2: Proximity
If two or more words are specified in the search string, documents in which those words occur near each other should be ranked higher.
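
Proximity can be approximated by the smallest distance, in word positions, between the two query words. The sketch below is one simple possibility under that assumption; the inverse-distance score is an illustrative choice, not a formula from the lecture.

```python
def min_distance(text, term_a, term_b):
    """Smallest gap in word positions between the two terms, or None."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)

def proximity_score(text, term_a, term_b):
    """Higher when the terms are closer; 0.0 when either term is missing."""
    distance = min_distance(text, term_a, term_b)
    if distance is None:
        return 0.0
    if distance == 0:  # both terms are the same word
        return 1.0
    return 1.0 / distance

near = proximity_score("car accident on main street", "car", "accident")
far = proximity_score("the car was parked far from the accident", "car", "accident")
```

Here `near` exceeds `far`, so the first document would be ranked higher for the query { Car, Accident }.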

7 Parameters Used for Ranking in a Typical Information Retrieval System
Parameter 3: Stemming
Uses the various forms of a word when searching, e.g. Run => Ran, Run over, Running.
An exact match of the word should be ranked higher, e.g. if the word “info” is searched, a document containing the word “infotech” should be ranked after a document containing the exact match “info”.
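
The "exact match first" rule can be sketched as follows. Note the assumption: simple prefix matching stands in for a real stemmer here, so irregular forms such as "Ran" would not be matched. The function names and sample documents are hypothetical.

```python
def match_score(query, word):
    """2 for an exact match, 1 for a word that merely starts with the query."""
    q, w = query.lower(), word.lower()
    if w == q:
        return 2
    if w.startswith(q):
        return 1
    return 0

def rank(query, docs):
    """Return ids of matching documents, best match first."""
    scored = []
    for doc_id, text in docs.items():
        best = max((match_score(query, w) for w in text.split()), default=0)
        if best > 0:
            scored.append((best, doc_id))
    # sorted() is stable, so ties keep their original order.
    return [doc_id for best, doc_id in sorted(scored, key=lambda p: -p[0])]

docs = {
    "a": "infotech company profile",
    "b": "info desk hours",
    "c": "weather report",
}
ranking = rank("info", docs)
```

Document "b" (exact match “info”) comes before document "a" (“infotech”), and document "c" is not returned at all.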

8 Parameters Used for Ranking in a Typical Information Retrieval System
Parameter 4: Frequency across Documents
Words such as "a", "an", and "the" should be suppressed, because they occur in almost every document and are most likely irrelevant to the search criteria.
If we are searching for ‘Microsoft Corporation’, the specific word “Microsoft” is more important than the general word “Corporation”.
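
A standard way to capture this idea is inverse document frequency (IDF), which the slide does not name explicitly; the sketch below assumes the common form log(N / df), where N is the number of documents and df is the number of documents containing the word. The sample collection is made up.

```python
import math

def inverse_document_frequency(term, docs):
    """log(N / df); 0.0 if the term appears in no document."""
    term = term.lower()
    df = sum(1 for text in docs if term in text.lower().split())
    return math.log(len(docs) / df) if df else 0.0

docs = [
    "microsoft corporation announces the results",
    "the acme corporation files a report",
    "the city council meets",
    "the corporation tax rises",
]
idf_the = inverse_document_frequency("the", docs)
idf_corporation = inverse_document_frequency("corporation", docs)
idf_microsoft = inverse_document_frequency("microsoft", docs)
```

In this collection "the" appears in every document and gets weight zero, "corporation" gets a small weight, and "microsoft" gets the largest weight.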

9 Parameters Used for Ranking in a Typical Information Retrieval System
Parameter 5: Page Access Frequency
If a page is accessed more often, i.e. if the page is popular, it should be ranked higher.
This kind of ranking requires maintaining a log of how often each page is accessed.
Useful for systems that store news, stories, or other readable articles.
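
A minimal sketch of ranking from an access log, assuming the log is simply a list with one entry per page view. The page paths are hypothetical.

```python
from collections import Counter

# Hypothetical access log: one entry per page view.
access_log = [
    "/news/a", "/news/b", "/news/a",
    "/stories/c", "/news/a", "/news/b",
]

def rank_by_popularity(log):
    """Return pages ordered from most to least frequently accessed."""
    return [page for page, _count in Counter(log).most_common()]

popularity_ranking = rank_by_popularity(access_log)
```

A real system would likely combine this count with the other parameters instead of ranking on popularity alone.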

10 Parameters Used for Ranking in a Typical Information Retrieval System
Parameter 6: Number of In-Links to the Page
This is the number of links from other pages on the web to the page being ranked.
It is another parameter for deciding the popularity of a page.
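
Counting in-links can be sketched over a small link graph. The graph below is invented for illustration, and the choices to ignore self-links and to count repeated links from the same page once are assumptions of this sketch.

```python
# Hypothetical link graph: page -> pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c", "a"],
}

def in_link_counts(links):
    """Count, for each page, how many other pages link to it."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):  # count each linking page once
            if target != source:     # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts

counts = in_link_counts(links)
```

Page "c" has the most in-links here, so under this parameter it would be treated as the most popular page.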

