Steve Cassidy Computing at MacquarieNo 1 Searching The Web Steve Cassidy Centre for Language Technology Department of Computing Macquarie University.

Presentation transcript:

No 1: Searching The Web
Steve Cassidy
Centre for Language Technology
Department of Computing
Macquarie University

No 2: The First Web Page

No 3: What is the Web?
Documents, text, images, sound
A web of hyperlinks
  – Link one (text) document to others
Easy to join
  – Any Internet user can be a publisher
Anarchic
  – No-one is in charge
Very big

No 4: The Problem
Much of the information available is text-based
Text is difficult for computers to process
The popular use of computers and the Internet has increased the availability of text-based information
Information Overload

No 5: The Solution?
Only one of the top four commercial search engines finds itself
The best navigation should make it easy to find almost anything on the web (once all the data is entered)
The Web 1997

No 6: How do they work?
Two major steps
  – Build an inverted index
  – Match query terms in the index
Problems
  – The web is very big
  – Finding relevant documents
  – Avoiding false hits

No 7: Inverted Index
[Figure: three documents D1, D2 and D3, built from the words computer, software, information, language, library, retrieval and filtering, alongside an inverted index that maps the terms computer, software, information and language to the documents containing them.]
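As a rough illustration of the inverted index idea, the sketch below maps every term to the documents that contain it. The document texts in `DOCS` and the function name `build_inverted_index` are assumptions made for this example; they are loosely based on the words in the figure, not a faithful copy of it.

```python
from collections import defaultdict

# Toy collection. The exact texts are illustrative assumptions.
DOCS = {
    "D1": "computer software information language",
    "D2": "computer library retrieval",
    "D3": "computer information retrieval filtering",
}


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Very naive tokenisation: lower-case and split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


index = build_inverted_index(DOCS)
for term in sorted(index):
    print(term, index[term])
```

The index is "inverted" because it goes from words to documents, rather than from a document to its words, so looking up a term never requires scanning the documents themselves.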

No 8: Building the Index
[Diagram labels: List of web addresses, Download web page, Parse web page, Index, New links, Web page text]

No 9: Building the Index
[Diagram labels: List of web addresses, Download web page, Parse web page, Index, New links, Web page text]
How Google Works
If you aren't interested in learning how Google creates the index and the database of documents that it accesses when processing a query, skip this description. I adapted the following overview from Chris Sherman and Gary Price's wonderful description of How Search Engines Work in Chapter 2 of The Invisible Web (CyberAge Books, 2001). Google consists of three distinct parts, each of which is run on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing.
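The cycle named on the Building the Index slides (take an address from the list, download the page, parse it, keep the text for the index and add new links to the list) might look roughly like the sketch below. To stay runnable without a network it "downloads" from `FAKE_WEB`, a made-up in-memory dictionary; the URLs, `PageParser` and `crawl` are all hypothetical names for this example, and a real crawler would also need politeness delays, robots.txt handling and error recovery.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" standing in for real HTTP downloads.
FAKE_WEB = {
    "http://example.org/": '<h1>Home</h1><a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.org/a.html": '<p>Page A</p><a href="/b.html">B</a>',
    "http://example.org/b.html": '<p>Page B</p><a href="/">Home</a>',
}


class PageParser(HTMLParser):
    """Collect the visible text and the link targets of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def crawl(seed_urls, fetch):
    """Return {url: page text} for every page reachable from the seeds."""
    queue = deque(seed_urls)   # the list of web addresses still to visit
    seen = set(seed_urls)      # addresses already queued, to avoid loops
    pages = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)      # download step (here: a dictionary lookup)
        if html is None:
            continue
        parser = PageParser()  # parse step
        parser.feed(html)
        parser.close()
        pages[url] = " ".join(parser.text_parts)  # text goes to the indexer
        for href in parser.links:                 # new links go back on the list
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


pages = crawl(["http://example.org/"], FAKE_WEB.get)
```

Passing the download function in as `fetch` keeps the loop itself independent of how pages are retrieved, so the same sketch could be pointed at a real HTTP client.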

No 10: Using the Index
Query: computer software information
[Figure: the inverted index (terms computer, software, information, language; documents D1, D2, D3) used to look up the query terms; the document lists shown for the query are D1 D2 D3, D1 D3 and D1.]
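One plausible way to match a multi-word query against an inverted index is to look up each term and keep only the documents that appear in every list. The sketch below assumes that all query terms must match (AND semantics), which the slides do not state explicitly; `INDEX` and `search` are illustrative names, and the postings are toy values rather than the figure's exact contents.

```python
# Toy index. The postings are illustrative assumptions.
INDEX = {
    "computer": ["D1", "D2", "D3"],
    "software": ["D1"],
    "information": ["D1", "D3"],
    "retrieval": ["D2", "D3"],
}


def search(index, query):
    """Return the documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Look up each term; an unknown term has an empty postings list.
    postings = [set(index.get(term, ())) for term in terms]
    # Start from the shortest list so the intersection shrinks quickly.
    postings.sort(key=len)
    result = postings[0]
    for other in postings[1:]:
        result = result & other
    return sorted(result)


print(search(INDEX, "computer software information"))
print(search(INDEX, "computer retrieval"))
```

Under this assumption every extra query term can only shrink the result set, which matches the later advice to use more query terms to narrow a result.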

No 11: Server Farm
Over 10,000 computers
Each with a copy of the index

No 12: Relevance
Finding pages with search terms is easy
Which ones are the best?
Google:
  – Text in titles, headings is important
  – Text earlier in the page is important
  – Text of links to this page is important
  – Important pages link to other important pages
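The last point, that important pages link to other important pages, can be illustrated with a simplified link-analysis iteration in the spirit of PageRank. This is a sketch only, not Google's actual ranking: the link graph `LINKS`, the function name `link_importance`, the damping value 0.85 and the iteration count are all assumptions chosen for the example.

```python
def link_importance(links, damping=0.85, iterations=50):
    """Score pages so that a link from an important page counts for more."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance
    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its importance equally among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}

ranks = link_importance(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph "home" ends up with the highest score because every other page links to it, while "orphan", which nothing links to, keeps only the baseline.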

No 13: Making the Most of Search Engines
Use words likely to appear in the pages you want
Use more query terms to narrow your result
Be brief
Don’t worry about spelling
Use “words in quotes” to search for phrases

No 14: Other Search Engines
  – Offers ‘refine your search’
  – Subject specific popularity
  – Natural language questions
search.yahoo.com

No 15: The Future
Information Extraction
  – Find all the details of this conference for my diary
Question Answering
  – When did Armstrong land on the moon?
The Semantic Web
  – Exchanging machine readable data

No 16: Language Technology
SLP148 Language, Logic and Computation
COMP248 Language Technology
COMP249 Web Technology
COMP348 Document Processing and the Semantic Web
COMP349 Spoken Language Dialogue Systems

No 17: Questions?