Online Database vs. Web Search Engines 571-Information Access and Retrieval.



Online Database

Overview of Online Databases: 30 Years (William, 2006)
  From 1975 to 2005, online databases grew considerably: the number of databases rose from 301 to […], database records from 52 million to […] billion, and database entries from 301 to […].
  The number of producers has not grown as fast as the number of databases, because one producer may publish multiple databases. The number of publishers increased from 200 to 3208 between 1975 and […]. In 2005, the average producer produced 5.13 databases.
  Because each vendor may provide services from multiple databases, the number of vendors grew at a slower pace, from 105 to 2811.

Types of search
  Known-item search
  Specific-information search
  Subject search
  Exploring/browsing information
  Others

General search steps
  1. Search plan
  2. System access
  3. Database selection (optional)
  4. Search query formulation
  5. Preliminary results evaluation
  6. Search query reformulation (optional)
  7. Final results evaluation (optional)

Some search strategies
  Building blocks: combine sub-searches
  Citation pearl growing: use the index term to retrieve further similar citations
  Successive fractions: reduce the set using narrower index terms
  Most specific facet first: start with the most specific concept
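As a rough illustration of the building-blocks and successive-fractions strategies, the sketch below treats each sub-search as a set of record IDs. The record IDs and concept names are invented for the example; a real online database would return such sets from its own search commands.

```python
# Hypothetical result sets: each sub-search returns a set of record IDs.
solar = {1, 2, 3, 5, 8}
photovoltaic = {2, 3, 9}
cost = {3, 5, 9, 10}
review_articles = {3, 10}

def building_blocks(*facets):
    """Each facet is a list of synonym sets: OR within a facet, AND across facets."""
    result = None
    for synonyms in facets:
        block = set().union(*synonyms)                          # OR the synonyms
        result = block if result is None else result & block    # AND the facets
    return result if result is not None else set()

def successive_fractions(start, *narrower):
    """Start from a broad set and intersect it with narrower sets one at a time."""
    current = set(start)
    steps = [current.copy()]
    for subset in narrower:
        current &= subset
        steps.append(current.copy())
    return steps

hits = building_blocks([solar, photovoltaic], [cost])
fractions = successive_fractions(solar | photovoltaic, cost, review_articles)
print(hits)        # records matching (solar OR photovoltaic) AND cost
print(fractions)   # the result set after each narrowing step
```

Keeping every intermediate set in `fractions` mirrors how a searcher checks the size of the set after each narrowing step before deciding whether to continue.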

Search strategy formulation
  Imagine the title and keywords of relevant documents
  Boolean operators: and, or, not
  Proximity operators: adj, near, freq, atleast
  Search fields/segments: au, co, ti, de
  Use controlled vocabulary to identify context
  Truncation: string, plural, single character
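Truncation can be sketched as a translation from a truncated term to a regular expression. The symbols used here are assumptions, since they differ between systems: `*` stands for any run of word characters and `?` for exactly one character.

```python
import re

def truncation_to_regex(term):
    """Translate a truncated search term into a whole-word regular expression.

    Assumed symbols (they vary by system): '*' matches any run of word
    characters, '?' matches exactly one word character.
    """
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)

def matches(term, text):
    """Return every word in text that the truncated term matches."""
    return [m.group(0) for m in truncation_to_regex(term).finditer(text)]

text = "Librarians and a library catalogue; women and a woman searching libraries."
stem_hits = matches("librar*", text)     # string truncation, also covers plurals
single_hits = matches("wom?n", text)     # single-character truncation
print(stem_hits, single_hits)
```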

How to find related words?
  Personal knowledge: terminology, relevant documents
  Term mapping provided by the system
  Feedback from search results: title, descriptor, text
  Others

Search strategy reformulation
  System: search fields, vocabulary, more like this, refine search, limit/focus search
  User: relevance feedback

Narrow search
  Find the right database
  Add another word or phrase
  Negative feedback (exclude one aspect of the search statement)
  Exclude related terminology
  Restrict to certain fields: title, descriptor, frequency, etc.
  Restrict to certain types of publication
  Restrict to a certain time range
  Restrict to a certain language

Evaluate search results
  Known item: title, author, publication, date
  Specific information: Key Word In Context (KWIC)
  Subject information: title, abstract, descriptor, full text
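A Key Word In Context display can be approximated in a few lines. This is a minimal sketch: the sample text and the context width are made up, and real systems format KWIC output in their own ways.

```python
def kwic(text, keyword, width=3):
    """Return each occurrence of keyword with `width` words of context per side."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        # Ignore surrounding punctuation and case when comparing.
        if word.strip('.,;:!?"()').lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines

sample = "Search engines rank pages. A meta-search engine queries several engines at once."
result = kwic(sample, "engines", width=2)
for line in result:
    print(line)
```

Scanning the words on either side of the bracketed term is usually enough to judge whether a hit answers a specific-information question without opening the full record.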

Check the tutorial for online databases: WT3.html

Web Search Engines

Characteristics of web IR
  Web documents
    Stored in distributed locations
    Growing in size
    Deep and surface documents
    Multiple formats
    Varying in quality
    Frequently changed
    Others
  Users
    Various user groups
    Others
  Systems

What is a search engine?
  [Diagram: Users, Internet, Search Engine]

Key components
  Data collection: web spider or crawler
  Data processing: ranking, indexing
  Query formulating: interface
  Matching
  Result displaying
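The processing and matching components can be sketched with a small inverted index. The page names and texts are hypothetical stand-ins for what a crawler would collect, and ranking is left out here.

```python
from collections import defaultdict

# Hypothetical pages standing in for what a crawler would fetch.
pages = {
    "a.html": "online database search strategies",
    "b.html": "web search engine ranking",
    "c.html": "database indexing and ranking",
}

def build_index(docs):
    """Data processing: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in docs.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def match(index, query):
    """Matching: return the pages containing every query word, sorted for display."""
    words = query.lower().split()
    if not words:
        return []
    result = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(result)

index = build_index(pages)
results = match(index, "database ranking")
print(results)
```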

How does ranking work?
  Literal matching
    Measure of word significance: the frequency of word occurrence (term frequency)
    Location: the relative position of a word
    Examples: ork.html, nk.html
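The two signals named here, term frequency and word location, can be combined into a toy score. The weights and sample pages below are illustrative assumptions, not values taken from any real search engine.

```python
def score(query_word, title, body):
    """Score one page for one word using term frequency and word position.

    The weights (title bonus 1.0, position weight 0.5) are illustrative
    assumptions only.
    """
    word = query_word.lower()
    body_words = body.lower().split()
    if not body_words:
        return 0.0
    tf = body_words.count(word) / len(body_words)            # term frequency
    title_bonus = 1.0 if word in title.lower().split() else 0.0
    position = 0.0
    if word in body_words:
        # An earlier first occurrence gives a position score closer to 1.
        position = 1.0 - body_words.index(word) / len(body_words)
    return tf + title_bonus + 0.5 * position

page_1 = score("search", "Search engines", "search engines crawl the web and search indexes")
page_2 = score("search", "Libraries", "libraries offer catalog search")
print(page_1, page_2)
```

Both pages have the same term frequency for "search", so the difference in score comes entirely from location: the word appears in the title and at the start of the first page.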

How does ranking work? (Cont'd)
  Hyperlinks (Brin & Page, 1998)
    PR(A) = (1-d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
    PR(A): PageRank of document A
    T1 … Tn: documents that link to document A
    C(A): number of outgoing links from document A
    d: damping factor between 0 and 1
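The PageRank formula can be evaluated by repeated substitution. This is a minimal sketch under stated assumptions: the three-page link graph is hypothetical, d = 0.85 is a commonly quoted value rather than one given in these slides, and every page is assumed to have at least one outgoing link so that C(T) is never zero.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    `links` maps each page to the list of pages it links to. Every page must
    have at least one outgoing link (no dangling pages in this sketch).
    """
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # Contribution of each page T that links to `page`.
            incoming = (pr[t] / len(links[t]) for t in links if page in links[t])
            new_pr[page] = (1 - d) + d * sum(incoming)
        pr = new_pr
    return pr

# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
print(ranks)
```

In this form of the formula the ranks sum to the number of pages rather than to 1. Page C ends up ranked above page B because it receives links from both A and B.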

Other types of search engines
  Directories: hierarchically organized indexes that allow you to browse through lists of web sites by category or subject
  Meta-search engines: query multiple search engines simultaneously and return a complete set of hits
  Specialized search engines
    Create a database of sites on a specific topic using robots or spiders
    For specific user groups
    Visualization

Examples of directories
  Yahoo Directory
  The Internet Public Library
  Librarians' Index to the Internet
  INFOMINE, from the University of California, is a good example of an academic subject directory

Examples of meta-search engines
  MetaCrawler
  Ixquick
  Clusty
  Mamma

More examples of specialized search engines
  Career Mosaic
  Diseases, Disorders and related topics
  The Day in History
  Shareware.com

User behaviors
  Web queries are short, rarely modified, and very simple in structure
  Users make very little use of advanced search features; when they do, about half of the attempts contain mistakes
  Users view only the first one or two pages of results
  Users show no interest in relevance feedback

User search patterns in different environments (Jansen & Pooch, 2001)

Appendix A: Tips
  Most search engines employ the principles of Boolean logic in the formulation of search queries. If you take the time to understand the basics of Boolean logic, you will have a better chance of search success.
  Search engines tend to have a default Boolean logic. This means that the space between multiple search terms defaults to either OR logic or AND logic. It is imperative that you know which logical operator is the default. Nowadays, the default logic tends to be AND, but you should always check the site's Help file to make sure.
  Another de facto standard is the requirement to put phrases in quotation marks, e.g., "death penalty".
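The effect of the default operator and of quoted phrases can be seen in a small sketch. The three documents are invented, and the matching is plain whole-word comparison rather than any particular engine's behavior.

```python
import shlex

# Hypothetical document collection.
docs = {
    1: "the death penalty debate in state law",
    2: "penalty for late payment and death of a contract",
    3: "state law on late payment",
}

def term_matches(term, text):
    """A multi-word term is a phrase and must appear as consecutive words."""
    words, target = text.lower().split(), term.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def search(docs, query, default="AND"):
    """Join the query terms with the default operator; quotes mark a phrase."""
    terms = shlex.split(query)              # keeps "quoted phrases" together
    combine = all if default == "AND" else any
    return sorted(doc_id for doc_id, text in docs.items()
                  if terms and combine(term_matches(t, text) for t in terms))

print(search(docs, "penalty law"))                 # default AND
print(search(docs, "penalty law", default="OR"))   # default OR
print(search(docs, "death penalty"))               # two separate words
print(search(docs, '"death penalty"'))             # exact phrase
```

The same two words return one document under AND and all three under OR, which is why it matters to know the default. Quoting the phrase drops the document where "death" and "penalty" appear far apart.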

Appendix A (Cont'd)
  If proximity operators (e.g., NEAR) are available, use them rather than specifying an AND relationship between your keywords. This ensures that your search terms are located near each other in the full-text document. The closer together your terms appear, the more likely the document is to be relevant. Google does proximity searching by default.
  Field searching is another extremely important way of limiting your search results in large search engines that contain millions of full-text files. For example, TITLE:slavery in a search engine such as AltaVista will bring you more relevant hits than merely searching on the keyword slavery.
  To enhance subject searches, try the URL field to narrow your results. The URL field offers a good way to search for certain subject terms because of the make-up of the URL.
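The difference between AND and a proximity operator can be sketched as a word-distance check. The window of five words and the sample sentences are assumptions; each system that offers NEAR defines its own distance.

```python
def near(text, term_a, term_b, max_distance=5):
    """True if term_a and term_b occur within max_distance words of each other.

    The default window size is an assumption; systems offering NEAR set their own.
    """
    words = [w.strip('.,;:!?"()').lower() for w in text.split()]
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)

close = "Slavery in colonial America shaped the economy."
far = ("America is discussed in chapter one. Much later, after many "
       "unrelated topics, the book turns to slavery.")

print(near(close, "slavery", "America"))   # terms three words apart
print(near(far, "slavery", "America"))     # terms sixteen words apart
```

Both sample texts would satisfy slavery AND America, but only the first passes the proximity test.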

Appendix A (Cont'd)
  The Internet is a self-publishing medium. It is not a library of evaluated publications selected by professionals. Rather, the Internet is a bulletin board containing everything from the definitive to the spurious. Everything must be analyzed for its appropriateness for research use.
  Before you select a search tool, always think about your topic and what you are trying to find. Once you begin your research, be sure to try out a handful of sites. Don't rely on a single site.
  Don't just Google everything! Google is great, but there are other useful tools on the Web, too. Google has become so popular that many people use this tool exclusively and miss out on others that might be more useful for their particular search.
  Others?

Appendix B
Anatomy of a URL
  The example is a URL on the CNN home page. It is typical of addresses hosted in domains in the United States:
    Protocol: http
    Host computer name: www
    Second-level domain name: cnn
    Top-level domain name: com
    Directory name: feedback
    File name: comments.html
  The directory name and file name often contain subject terms. These can be searched with the URL field. For example, URL:slavery will give you more relevant results than the keyword slavery by searching for this term as a directory name or a file name.
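The same breakdown can be reproduced with Python's standard `urllib.parse` module. The address used below is assembled from the components listed in this appendix, since the URL itself is not shown; the sketch also assumes a simple three-part host name such as www.cnn.com.

```python
from urllib.parse import urlparse
import posixpath

def anatomy(url):
    """Split a URL into the parts named in Appendix B.

    Assumes a host of the form host.second-level.top-level.
    """
    parsed = urlparse(url)
    labels = parsed.hostname.split(".")
    directory, filename = posixpath.split(parsed.path)
    return {
        "protocol": parsed.scheme,
        "host": labels[0],
        "second_level_domain": labels[-2],
        "top_level_domain": labels[-1],
        "directory": directory.strip("/"),
        "file": filename,
    }

# Address assembled from the listed components (an assumption, see above).
parts = anatomy("http://www.cnn.com/feedback/comments.html")
for name, value in parts.items():
    print(f"{name}: {value}")
```

The `directory` and `file` values are the parts a URL-field search would match against.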

Appendix C
  Search engine comparison chart: res/
  Tutorials: Google Tutorial