Indexing and Search Engines for the Intranets By Suvarsha Walters


Overview
- Introduction
- Types of Searching
- Parts of a Local Search Engine
- Working of a Local Search Engine
- Choosing a search engine
- List of some Intranet Search Engines
- Conclusion
- References

Introduction: Searching and Search Engines
- A good site is one in which ‘content is king’.
- A large volume of information makes a site huge and complex, and makes navigation difficult.
- Search is the user's lifeline for mastering complex websites.
- The search feature is essential for users who revisit a site looking for specific information.

Introduction: Searching and Search Engines
- Search is also the users' escape hatch when they are stuck in navigation. When they can't find a reasonable place to go next, they often turn to the site's search function.
- This is why site search is an important feature of any site of reasonable size.

Types of Searching
A search can be of various types:
- Internet search: Search engines such as Yahoo and Infoseek crawl the web gathering pages (or information about pages), index them, and retrieve the pages in which a search term is found.
- Database search: Databases store their information neatly organized into fields, and a search interface is provided for querying them.

Types of Searching
- With databases, one can set up complex queries to find the search words in all applicable fields.
- But this makes them slower to respond, and it requires more memory and some programming.
- Database search is not oriented towards text search and relevance ranking; it is well suited to tasks such as listing an inventory or the directory of an institute.
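As a rough illustration of a database search across all applicable fields, here is a minimal sketch using Python's built-in sqlite3 module. The staff table, its columns and its rows are hypothetical and stand in for an institute directory. Note that the rows come back in name order, with no relevance ranking.

```python
import sqlite3

# Hypothetical institute directory held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, department TEXT, room TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        ("Ana Ruiz", "Library", "B12"),
        ("Ben Cho", "IT Services", "A03"),
        ("Cara Lib", "Finance", "C07"),
    ],
)

def find_staff(connection, term):
    """Return names of staff whose name, department or room contains the term."""
    pattern = "%" + term + "%"
    rows = connection.execute(
        "SELECT name FROM staff "
        "WHERE name LIKE ? OR department LIKE ? OR room LIKE ? "
        "ORDER BY name",
        (pattern, pattern, pattern),
    )
    return [row[0] for row in rows]

print(find_staff(conn, "lib"))
```

Each field has to be named explicitly in the query, which is the programming effort mentioned above.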

Types of Searching
- Intranet search: Search is restricted to a site or a group of sites.
- Text search engines store the site's text in one index and can find words in any field of a record.
- Many high-end search engines can also store field information, so searches can be limited to a specific field as well.

Parts of a Local Site Search Tool
- Search Indexer: The program that reads the documents on the site and creates an index of them. The index is stored in a file, called the index file, where the search engine will find it.
- Search Index File: Created by the search indexer, this file stores the data from the site in a special index or database designed for very quick access.
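The indexer's job can be sketched as building an inverted index, a mapping from each word to the pages that contain it. This is only an illustration under simple assumptions: the page paths and texts are made up, the index is kept in memory rather than written to an index file, and real indexers also handle stop words, stemming and file formats.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of page paths that contain it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for word in tokenize(text):
            index[word].add(path)
    return index

# Hypothetical site content, keyed by page path.
documents = {
    "/library/hours.html": "Library opening hours and holiday schedule",
    "/library/catalog.html": "Search the library catalog for books",
    "/it/helpdesk.html": "Helpdesk opening hours and contact details",
}

index = build_index(documents)
print(sorted(index["hours"]))
```

Looking up a word is then a single dictionary access, which is why a prebuilt index answers queries much faster than scanning every page.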

Parts of a Local Site Search Tool
- Search Form: The HTML interface to the site search tool, where visitors enter their search terms and specify their preferences for the search.
- Search Engine: The program (CGI, server module or separate server) that accepts the request from the form or URL, searches the index, and returns the results page to the server.

Parts of a Local Site Search Tool
- Results Listing: The HTML page listing the pages that contain text matching the search term(s). These are sorted in some kind of relevance order, with the closest match at the top. The format is often defined by the site search tool, but may be modified in some ways.
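One simple way to put the closest match at the top is to score each page by how often the search terms occur in it. The sketch below assumes that scoring rule purely for illustration; actual search tools use more elaborate relevance formulas, and the pages shown are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, documents):
    """Return (path, score) pairs for matching pages, best match first."""
    terms = tokenize(query)
    results = []
    for path, text in documents.items():
        counts = Counter(tokenize(text))
        # Score = total number of occurrences of the query terms.
        score = sum(counts[term] for term in terms)
        if score > 0:
            results.append((path, score))
    # Highest score first; ties are broken by path for a stable order.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results

# Hypothetical pages.
documents = {
    "/a.html": "search engine basics: how a search engine builds an index",
    "/b.html": "choosing a site search tool",
    "/c.html": "campus map and directions",
}

results = rank("search engine", documents)
print(results)
```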

Working of a Local Search Engine
[Diagram: The Indexer gets words from the Web Site Documents and stores them in the Index. The Search Form sends the query to the Search Engine, which looks in the Index, gets the matches, and sends formatted results to the Results Page. The user selects the required page and views the retrieved page.]

Types of Search Engines
- CGI Programs: The Common Gateway Interface (CGI) standard allows a web server to communicate with external programs. A search engine can run as one of these CGI programs.
- Server Plug-Ins: For better data interchange, less overhead and more flexibility, web server companies have defined APIs (Application Programming Interfaces) to their servers. These allow third-party developers to create modules that run inside the server process.
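A CGI search program receives the form data through environment variables and writes an HTTP header block followed by the page to standard output. The sketch below shows only that request and response handling. The parameter name q is an assumption, and the index lookup is omitted.

```python
import html
import os
from urllib.parse import parse_qs

def handle_request(environ):
    """Build a CGI response (header, blank line, body) for a search request."""
    # With the GET method, the web server puts the form data in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    query = params.get("q", [""])[0]  # "q" is a hypothetical form field name
    body = "<html><body><p>You searched for: {}</p></body></html>".format(
        html.escape(query)
    )
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # When run by a web server as a CGI program, os.environ holds the request.
    print(handle_request(os.environ), end="")
```

Because the server starts a new process for every request, this approach carries the overhead that server plug-ins are designed to avoid.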

Types of Search Engines
- Search Servers: Some search engines run as separate servers. The form data is passed as part of the URL, as with a CGI program, but the search engine application runs as a separate HTTP server on a different machine. This reduces the load on the main web server.
- Remote Searching: It is also possible to outsource search to a remote site search service. The indexer and search engine run on the remote server. Using a web indexing robot, or spider, the service follows links on the site and reads the pages, then stores every word in the index file on that server. When it comes time to search, the form on the site's web page sends a message to the remote search engine, which sends results back to the site.
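The link-following behaviour of a spider can be sketched with Python's standard HTML parser. To keep the example self-contained, the pages live in a dictionary instead of being fetched over the network. The host intranet.example, the page contents and the fetch callback are all hypothetical, and the step of storing each page's words in the index is left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch):
    """Return the pages reachable from start_url that stay on the same site."""
    visited = []
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in visited:
            continue
        page = fetch(url)
        if page is None:  # page missing or unreadable
            continue
        visited.append(url)
        extractor = LinkExtractor()
        extractor.feed(page)
        for href in extractor.links:
            target = urljoin(url, href)
            if target.startswith(start_url):  # ignore links to other sites
                queue.append(target)
    return visited

# Hypothetical site, held in memory.
site = {
    "http://intranet.example/": '<a href="about.html">About</a> <a href="http://other.example/">Other</a>',
    "http://intranet.example/about.html": '<a href="/">Home</a> <a href="staff.html">Staff</a>',
    "http://intranet.example/staff.html": "Staff list",
}

pages = crawl("http://intranet.example/", site.get)
print(pages)
```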

Choosing a Site Search Tool
- Technical considerations
- Indexing features
- Searching capabilities
- Results display
- Costs, licensing and registration requirements
- Unique features (if any)

Features of search engines: Technical Considerations

Features of search engines: Indexing features

Features of search engines: Search Capabilities

Features of search engines: Searching features

Features of search engines: Results Display

Choosing the right search engine
Checklist of factors to be considered while selecting the search engine:
- Size of the website
- Technical expertise available (local and/or from the supplier or developer)
- System platforms available
- Information sources and services to be supported
- Document collection: type, volume (now and in future)
- Indexing, search and display requirements

Choosing the right search engine
Checklist of factors to be considered while selecting the search engine:
- User community to be served
- Whether the need is to index the web site pages or to index databases and document collections (text, bibliographic, DBMS, etc.)
- Support for the concept of a "record" by the search engine
- Support for structured fields and metadata
- Cost

Choosing the right search engine
Steps in the selection and procurement of search engines:
- Conduct a needs analysis.
- Talk to other libraries.
- Attend trade shows and talk to vendors.
- Read the literature that reviews search engines.
- Compile a list of possible products.

Choosing the right search engine
Steps in the selection and procurement of search engines:
- Compare the functionality of each product to the criteria you developed through the needs analysis.
- Narrow your list down to three possible products.
- Spend additional time learning about each product.
- Invite the vendors in for demonstrations.
- Ask for references and follow up with each reference.
- Select a product and implement it.
- Follow up with end users.
- Continue an ongoing review with end users.

Choosing the right search engine
Some suggestions:
- The development or selection of a search system should be based primarily on local needs.
- Consider using freeware search engines if they meet your requirements.
- For large, highly developed intranet sites, you may want to consider commercial search engines.
- Check whether the web server you are using supports indexing and search, and whether this is adequate for you.

Choosing the right search engine
- IT professionals should make an effort to keep abreast of current web technologies.
- The features available within a tool should be used properly to get maximum benefit.
- Carefully consider the interrelations between the three major components: document resources, users and the search engines.

Conclusion
- Since search is such a common activity, the search box should appear on every page of your web site.
- The initial target of the basic search should be the contents of the entire web site.
- The basic search should allow for Boolean commands ("and," "or"), although this does not need to be explained.
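Boolean commands map naturally onto set operations over an index: "and" is an intersection and "or" is a union. The sketch below makes several simplifying assumptions: queries are evaluated strictly left to right, adjacent words without an operator are treated as "and", and the small index is hypothetical.

```python
def boolean_search(query, index):
    """Evaluate a query such as 'library and hours' against an inverted index."""
    result = None
    operator = "and"
    for token in query.lower().split():
        if token in ("and", "or"):
            operator = token
            continue
        matches = index.get(token, set())
        if result is None:
            result = set(matches)  # copy, so the index itself is never modified
        elif operator == "and":
            result &= matches      # keep only pages that contain both terms
        else:
            result |= matches      # add pages that contain the new term
        operator = "and"           # adjacent terms default to "and"
    return result or set()

# Hypothetical inverted index: word -> pages that contain it.
index = {
    "library": {"/hours.html", "/catalog.html"},
    "hours": {"/hours.html", "/helpdesk.html"},
    "catalog": {"/catalog.html"},
}

print(sorted(boolean_search("library and hours", index)))
```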

Conclusion
- A quality search process begins with quality metadata. It's that old principle: garbage in, garbage out. Metadata is about giving structure to the content. For example, if every document is assigned keywords or classified by geography, the reader will get a much more accurate return from his or her search.
- Search engines are the mortar of the intranet. Because they are so important, their implementation must be given high priority, with the necessary time allotted for research and development.
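To illustrate how metadata sharpens results, the sketch below filters documents by assigned keywords and geography instead of by words in the body text. The record layout, field names and sample documents are hypothetical.

```python
def search_by_metadata(records, **criteria):
    """Return titles of records whose metadata matches every given field."""
    hits = []
    for record in records:
        if all(value in record.get(field, []) for field, value in criteria.items()):
            hits.append(record["title"])
    return hits

# Hypothetical documents with assigned keywords and a geography classification.
records = [
    {"title": "Flood response plan", "keywords": ["emergency", "flood"], "geography": ["Asia"]},
    {"title": "Office relocation memo", "keywords": ["facilities"], "geography": ["Europe"]},
    {"title": "Earthquake drill schedule", "keywords": ["emergency"], "geography": ["Asia", "Europe"]},
]

print(search_by_metadata(records, keywords="emergency", geography="Europe"))
```

A document is returned only if it was classified that way when it was created, which is the "garbage in, garbage out" dependency described above.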

List of some (Free) Intranet Indexing Tools (for Windows)
- Microsoft Index Server
- DeepSearch
- Harvest
- HomepageSearchEngine
- Swish-E

List of some (Free) Intranet Indexing Tools (for Windows)
- PLWeb Turbo (PLS / AOL)
- Namazu
- Oracle interMedia
- HomepageSearchEngine
- Sharewire SiteSearch

Free and commercial search engines
For HTML and text files (web site indexing and file/directory level indexing):
- SWISH-E (sunsite.berkeley.edu/SWISH-E/)
- ht://Dig (htdig.sdsu.edu/)
- Excite For Web Servers
- WebGlimpse (glimpse.cs.arizona.edu/webglimpse/)
For structured/formatted data:
- MySQL

Free and commercial search engines
Commercial search engines:
- AltaVista
- Fulcrum
- Infoseek (software.infoseek.com)
- Open Text
- Oracle
- PLS
- Verity

References
- Practical Example of Choosing a Site Search Tool
- University of Pennsylvania
- Search Engine Watch Page
- Web Admin's Guide to Site Search Tool
- List of Search Tools
- Review of Remote Search Services