Principles of Searching © Tefko Saracevic1 Web searching & the invisible web Finding things that are hard to find.

Presentation transcript:

Principles of Searching © Tefko Saracevic1 Web searching & the invisible web Finding things that are hard to find

Principles of Searching © Tefko Saracevic 2
Dictionary definitions
- World wide web: Internet-connected files; the very large set of linked documents and other files located on computers connected through the Internet and used to access, manipulate, and download data and programs
- Invisible: not easily noticed; not noticed or detected readily
- Invisible web: not yet in the dictionary

Principles of Searching © Tefko Saracevic 3
What is the “invisible web”?
- Materials that general search engines cannot or WILL not include in their collection of web pages (indexes)
  - you cannot find them through general search engines
- Contains a vast amount of information resources
  - much of it authoritative & of higher quality than the visible web
  - quality becomes a main issue
  - much of it specialized
  - a lot of it also fluid, streaming, or real time
  - “You can’t step in the same river twice”
  - much of it free
- Many times larger than the visible web

Principles of Searching © Tefko Saracevic 4
in other words…
There is much more to the web than [image not preserved] or [image not preserved]
Distribution of use: [chart not preserved]

Principles of Searching © Tefko Saracevic 5
Why do search engines not cover everything?
- Size: the web is huge; engines cannot cover all of it
- Economics: associated costs are high
  - engines support themselves mostly by ads
  - a number of engines also sell ranking & crawl updates, providing paid listings first & foremost
- Technical: still a challenge; capabilities are limited
  - some file formats are also hard to cover
- Spam: eliminating the bad also loses some of the good
- Restrictions: some sites do not let crawlers in
- Deep structure: some sites are too complex to crawl fully
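
The "Restrictions" point can be illustrated with a minimal sketch, assuming the site states its restrictions in a robots.txt file (one common mechanism; the slide does not name one). The robots.txt content, the `ExampleBot` agent name, and the example.org URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the (hypothetical) robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

print(allowed("https://example.org/index.html"))
print(allowed("https://example.org/private/report.html"))
```

Pages under the disallowed path never reach a well-behaved engine's index, so they stay invisible to its users.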

Principles of Searching © Tefko Saracevic 6
How do search engines work? Main parts
- Crawlers, spiders: go out to find content
  - looking for new & changed sites
  - periodic, not for each query
  - no search engine works in real time
- Organizing content: labeling, arranging
  - indexing for searching or classifying as a directory
- Databases, caches: storing content
- Retrieval engine: searching on the basis of a query
- Interface: handles the query, displays results
- All based on various, mostly proprietary algorithms
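
The parts listed above can be sketched as a toy pipeline. This is only an illustration under stated assumptions: the `PAGES` dictionary stands in for content a crawler collected earlier, and the inverted index with AND-matching is a textbook technique, not the proprietary algorithm of any actual engine.

```python
from collections import defaultdict

# Hypothetical pages standing in for what a crawler collected earlier.
PAGES = {
    "a.html": "invisible web databases",
    "b.html": "web search engines",
    "c.html": "library databases and catalogs",
}

def build_index(pages):
    """Organize content: map each term to the set of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

def search(index, query):
    """Retrieval: return pages that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)

INDEX = build_index(PAGES)           # built periodically, not per query
print(search(INDEX, "web"))
print(search(INDEX, "databases web"))
```

The query runs against the stored index rather than the live web, which is why results can lag behind the current state of a site.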

Principles of Searching © Tefko Saracevic 7
Search engine coverage
- No engine covers more than a fraction of the WWW
  - estimates: none more than 16%
- Hard (impossible) to discern & compare coverage
- Many national search engines
  - own coverage, orientation, governance
- Many topical or domain search engines
  - own coverage geared to the subject of interest
- Many comprehensive sources independent of search engines
  - some are compilations of evaluated web sources

Principles of Searching © Tefko Saracevic 8
Search engines differ
- Substantial differences among search engines on each of these parts
  - need to know how they work & differ
- Information about search engines:
  - Search Engine Watch: ratings, news, statistics, charts, explanations, tutorials
  - Search Engine Showdown: “The users’ guide to web searching”; run by a librarian; news, links, ratings

Principles of Searching © Tefko Saracevic 9
Invisible web searching: Basic approach
- The first step in determining the best approach for searching the invisible web is to have a clear idea of what you’re seeking
  - extensive user modeling
- Limit your search to appropriate resources & tools for the particular type of information you’re looking for
  - know your sources
  - know how to find appropriate sources
  - shades of “Knowledge is of two kinds…”

Principles of Searching © Tefko Saracevic 10
Specialized sources - particularly for the invisible web
The rest of the lecture covers:
1. Meta search engines
2. Specialized engines & catalogs
3. Domain (subject) engines & catalogs
4. Reference sources
5. Libraries as web sources
6. Virtual libraries
7. Subject databases
8. Societies, organizations
9. Good old books

Principles of Searching © Tefko Saracevic 11
Meta search engines
- Meta search engines search multiple engines
  - getting combined results from a variety of engines
- Finding a search engine or meta engine:
  - SearchEngines.com: search for engines by topic, geography, reference
  - Search Engine Guide: engines categorized by topic; other engine information
  - Search Engine Colossus: international directory of search engines by country & topic, from 198 countries and 61 territories; engines in a choice of languages

Principles of Searching © Tefko Saracevic 12
Sample of meta engines
Some meta engines provide organized results:
- Dogpile: results from a number of leading search engines; gives the source, so overlap can be compared (also has a (bad) joke of the day)
- Surfwax: gives statistics, text sources & links to sources; for some terms gives related terms to focus the search
- Teoma: results with suggestions for narrowing; links to derived resources; originated at Rutgers
- Turbo10: provides results in clusters; the engines searched can be edited
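
Combining results from several engines while keeping the source of each hit, so that overlap can be compared, might look like the following minimal sketch. The engine names, URLs, and the ordering rule (more engines first, then best rank) are illustrative assumptions, not how Dogpile or any other meta engine actually works.

```python
def merge_results(results_by_engine):
    """Merge ranked result lists, recording which engines returned each URL."""
    merged = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            entry = merged.setdefault(url, {"engines": [], "best_rank": rank})
            entry["engines"].append(engine)
            entry["best_rank"] = min(entry["best_rank"], rank)
    # Assumed ordering: URLs found by more engines first, then by best rank.
    ordered = sorted(
        merged.items(),
        key=lambda kv: (-len(kv[1]["engines"]), kv[1]["best_rank"], kv[0]),
    )
    return [(url, info["engines"]) for url, info in ordered]

# Hypothetical ranked results from two hypothetical engines.
SAMPLE = {
    "engine_a": ["x.org", "y.org", "z.org"],
    "engine_b": ["y.org", "w.org"],
}

for url, engines in merge_results(SAMPLE):
    print(url, engines)
```

A URL listed under more than one engine is the overlap; a URL listed under a single engine shows what that engine alone contributed.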

Principles of Searching © Tefko Saracevic 13
meta search engines (cont.)
- Large directory
  - Complete Planet: directory of over 70,000 databases & specialty engines
- Results with graphical displays
  - Vivisimo: clusters results; innovative
  - Webbrain: results in a tree structure; fun to use
  - Kartoo: results displayed by topics of the query

Principles of Searching © Tefko Saracevic 14
Domain engines & catalogs
Cover general & specific subjects
- Open Directory Project: large edited catalog of the web; global, run by volunteers
- BUBL LINK: selected Internet resources covering all academic subject areas; organized by the Dewey Decimal System; from the UK
- Profusion: search in categories for resources & search engines
- Resource Discovery Network (UK): “UK's free national gateway to Internet resources for the learning, teaching and research community”

Principles of Searching © Tefko Saracevic 15
domain engines …
Available in a variety of domains & subjects - rich!
- Think Quest (Oracle Education Foundation): education resources, programs; web sites created by students
- All Music Guide: resource about musicians, albums, and songs
- Internet Movie Database: treasure trove of American and British movies
- Genealogy: links and surname search engines; well... that is getting really specialized (and popular)

Principles of Searching © Tefko Saracevic 16
domain engines …
Scholarship, science
- Psychcrawler (Amer Psychological Association): web index for psychology
- Entrez PubMed (Nat Library of Medicine): biomedical literature from MEDLINE & health journals
- CiteSeer (NEC Research Center): scientific literature, citation index; strong in computer science
- Scholar Google: searches for scholarly articles & resources
- Infomine: scholarly internet research collections

Principles of Searching © Tefko Saracevic 17
Reference services
Reference services - several models
- Ask Jeeves!: most popular, commercial
- Information Please: almanac-type questions
- RefDesk: access to a number of reference tools
- Wikipedia: web encyclopedia in many languages
- Martindale’s The Reference Desk: probably the most amazing & versatile reference collection on the web; numerous sections, great to explore

Principles of Searching © Tefko Saracevic 18
reference …
Digital reference - new service area for libraries
- QuestionPoint (L of Congress & OCLC): project for a global reference network
- Virtual Reference Desk (L of Congress): large compilation of web reference sites
- LiveRef (maintained at Iowa State U): a registry of real-time digital reference services

Principles of Searching © Tefko Saracevic 19
Libraries as web sources
Academic, national libraries providing open collections & services; models vary
- Rutgers libraries: big long-term effort
- University of California, Berkeley: a most elaborate effort together with Sun Corporation
- LibWeb (U California, Berkeley): “lists currently over 7200 pages from libraries in over 125 countries”
- Bibliothèque Nationale de France: includes virtual exhibitions, among others

Principles of Searching © Tefko Saracevic 20
Virtual libraries on the Web
Libraries emerging only on the Web
- Virtual Library: Switzerland, US, UK & other countries; ‘oldest virtual library on the Web’
- Internet Public Library (U of Michigan): also a long-term effort
- Librarians Index of the Internet: very popular and comprehensive
- Digital librarian: “a librarian's choice of the best of the Web”; compiled and annotated by a librarian

Principles of Searching © Tefko Saracevic 21
virtual libraries …
- Academic Info Digital Library: many links to digital collections & resources in various subjects
- Gabriel: Gateway to European National Libraries
- Museum of online museums: a delight
- Stanford Encyclopedia of Philosophy: a comprehensive encyclopedia and library
- The historical New York Times Project: universal library; ongoing digitization

Principles of Searching © Tefko Saracevic 22
Subject resources
- Many subject-specific sites
  - rich & often unique coverage & services
  - different approaches & requirements
- Examples in health-related domains:
  - WebMDHealth: news, medical information
  - Rxlist: The Internet Drug Index
  - Mayo Clinic HealthOasis: health advice
  - Kidshealth: sites for parents, kids, teens

Principles of Searching © Tefko Saracevic 23
Subject resources …
Scholarship, humanities, government
- KIRKE (Katalog der Internetressourcen für die Klassische Philologie aus Erlangen): German; a variety of resources for classics
- Perseus Digital Library (Tufts University): covers antiquity to the Renaissance; one of the best subject sites on the web; affected the whole field
- Sch of Slavonic & East European Studies (University College London): includes country resources, e.g. Croatia
- U Mich Document Center: official documents from all over the world

Principles of Searching © Tefko Saracevic 24
Subject resources …
Growing number of resources in arts, museums
- MuseumStuff.com: “We have 1000's of museums, zoos, historical societies and related organizations in our database”
- The State Hermitage Museum: one of the greatest museums in the world, and one of the best museum sites; developed with IBM help
- National Museum of Science and Technology Leonardo da Vinci: guess where those pictures came from. A delight!

Principles of Searching © Tefko Saracevic 25
subject resources …
- Diotima: materials for the study of women and gender in the Ancient World
- Moving Images Collections: “MIC documents moving image collections around the world.” One part is particularly oriented toward science educators. Now at the Library of Congress, but developed at Rutgers.
- And, of course … Snoopy: The Official Peanuts Website

Principles of Searching © Tefko Saracevic 26
Societies, organizations
- Many societies, agencies developed their sites
  - a great many rich sources for searching & resources
  - differences in requirements, depth, richness
- Assoc. for Computing Machinery: Digital Library; by subscription or registration, or through RUL
- US State Department: about the U.S. & other countries
- FirstGov: the US government official web portal
- Ocean Planet: NASA presentation of the earth & its vast oceans
- ArXiv (Cornell U, National Science Foundation): e-print service in the fields of physics, mathematics, non-linear science, computer science, and quantitative biology

Principles of Searching © Tefko Saracevic 27
Archiving, books on the web
- Internet Archive: a large undertaking
  - includes a web archive & lots more, publicly available & free
  - 10 billion web pages archived from 1996 to a few months ago
  - Wayback Machine: search to look at old versions of web pages
- Books on the web
  - Million Book Project: digitizing books and providing free access
  - International Children’s Digital Library: online children's books
  - Digital Book Index: “links to more than 105,000 title records from more than 1800 commercial and non-commercial publishers, universities, and various private sites”
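
Looking up an old version of a page can also be done programmatically. The sketch below only builds a lookup URL; it assumes the Internet Archive's public availability endpoint at archive.org/wayback/available with `url` and `timestamp` query parameters, which is not mentioned in the lecture and should be checked against the Archive's current documentation. The page and date are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Internet Archive availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"

def availability_url(page, timestamp=""):
    """Build a lookup URL for the archived copy of `page` closest to `timestamp`.

    `timestamp` is a digit string such as YYYYMMDD; empty means "most recent".
    """
    params = {"url": page}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)

print(availability_url("example.org", "20050101"))
```

Fetching that URL is left out on purpose, so the sketch runs without network access.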

Principles of Searching © Tefko Saracevic 28
Language barriers on the Web
- English still the major language
  - but declining, now slightly over 50%
- Multilingual retrieval search engines
  - Euroseek: searches in a number of languages
  - All the Web: results in 45 languages

Principles of Searching © Tefko Saracevic 29
Web news; keeping up
What is going on on the Web? Some major sources of news and evaluations:
- Free Pint: newsletter, articles, links; nice & sometimes quirky
- Internet Resources Newsletter: UK based; monthly newsletter for “academics, students, engineers, scientists and social scientists”
- ResearchBuzz: daily updates; many aspects; “Collection of items on search engines, online databases, and other information resources”
- About.com Web Search: tools, Web Search Forum

Principles of Searching © Tefko Saracevic 30
keeping up …
- Information Today: trade & professional monthly newspaper & web site; industry news; searcher columns; general analyses of trends
- Keeping up through the blogosphere:
  - Resource Shelf: blog about the internet (and some other stuff) with an archive; it has really good and really bad exchanges & threads
  - New York Times blogrunner (The annotated NYT): blog tracking of NYT articles, topics, authors; threads into discussions on many other weblogs; includes net & web topics

Principles of Searching © Tefko Saracevic 31
Finding links & listings – back to good old books with a new twist
A number of books on web searching also have sites with the links from the book, updates, news
- Extreme Searcher (Randolph Hock): update of a popular book; links by chapter topics
- The web library (Nicholas G. Tomaiuolo): spotlights free resources; links by chapter and new topics; done by a librarian
- The invisible web (Chris Sherman & Gary Price): original book on the topic; links organized by subject
p.s. most, but not all, of the sites in this lecture can be found on those sites, and much, much more

Principles of Searching © Tefko Saracevic 32
Evaluations, ratings
Evaluating web sites: a prime responsibility of searchers & all information professionals
Many sources evaluate web sites:
- The Scout Report: librarians’ BIBLE! Annotations. Comprehensive.
- Medical Library Association: ten most useful sites for consumer health
- MLA user guide: for finding & evaluating health information on the web
- Web 100: commercial; user ranking & evaluation of web sites
- Evaluating web pages (UC Berkeley): tutorial and guide

Principles of Searching © Tefko Saracevic 33
Needed for Web searching
- Knowledge & competencies on
  - variety of web sources & their organization
  - search engines
  - web search strategies
  - search dynamics, feedback
- Keeping up & up & up
  - Why? many reasons, such as:
  - constant updates, changes, innovations
  - many sources are domain/subject specific
  - fluidity very high

Principles of Searching © Tefko Saracevic 34
Needed for web searching by professionals
- Knowledge of SOURCES in the area of interest
  - search engines are not enough
  - they are not too helpful in finding these other sources; structure is hard to discern
  - find & use specialized sources
- Evaluation of sources
  - a key professional skill!
  - application of standard criteria & web criteria: authority; accuracy; currency (timeliness); objectivity; coverage; persistence; usability
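
One way to apply the criteria consistently is a simple checklist. The sketch below is only an illustration: the seven criteria come from the slide, but the 1-to-5 rating scale, the unweighted average, and the sample ratings are assumptions made for the example.

```python
# Criteria named in the lecture; the 1-5 scale below is an assumption.
CRITERIA = (
    "authority", "accuracy", "currency", "objectivity",
    "coverage", "persistence", "usability",
)

def evaluate(ratings):
    """Average the ratings, refusing to score a source with unrated criteria."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    return sum(ratings[c] for c in CRITERIA) / len(CRITERIA)

# Hypothetical ratings for a hypothetical site.
sample = {c: 4 for c in CRITERIA}
print(evaluate(sample))
```

Requiring a rating for every criterion keeps a searcher from judging a source on authority alone while overlooking, say, currency or persistence.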

Principles of Searching © Tefko Saracevic 35
Needed competencies …
- Knowledge of users & use
- Knowledge of searching
- Use of technology
- Adaptability, flexibility
- Integration with other resources
- Teaching others
- Constant learning & update
  - again: keeping up, keeping up, keeping up
  - and again: keeping up, keeping up, keeping up

Principles of Searching © Tefko Saracevic36 information WWW But now really: How to do it?

Principles of Searching © Tefko Saracevic39 Images from the invisible web

Principles of Searching © Tefko Saracevic40 images …

Principles of Searching © Tefko Saracevic41 images …

Principles of Searching © Tefko Saracevic42 and of course…

Principles of Searching © Tefko Saracevic 43
P.S. a nice site
- Poem by Emily Dickinson: In a library
- Who will write a poem: In a digital library ??????

Principles of Searching © Tefko Saracevic 44
P.S. a few weird or fun sites…
- SelectSmart.com: all kinds of quizzes for you
- James Dean: official web site
- Deaducated: Dead Librarians’ Society
- Livejournal: blogs & authoring tools; and many pathetic entries
- Airline meals: “the world’s first and leading site about nothing but airline food” … some 12,000 pictures from 447 airlines
  - it is not weird, but for real and great fun

Principles of Searching © Tefko Saracevic45 Sources About.com Web Search Academic Info Digital Library Airline meals All the Web Ask Jeeves! Assoc. for Computing Machinery Bibliothèque Nationale de France BUBL LINK CDNET Search.com CiteSeer CompletePlanet Deaducated Digital book index Digital librarian Diotima Dogpile Entrez PubMed Extreme Searcher Free Pint Gabriel Genealogy

Principles of Searching © Tefko Saracevic46 sources … Hermitage Information Please International Children’s Digital Library Internet Archive Internet Public Library, Michigan Internet Resources Newsletter. James Dean Kartoo KIRKE Leonardo da Vinci Museum Librarians Index to the Internet Live Journal LiveRef Martindale’s The reference Desk Mayo Clinic Medical Library Assoc. ten top sites Medical Library Assoc. user guide for health inf. Medscape

Principles of Searching © Tefko Saracevic47 sources … Million Book Project Museum of online museums. MuseumStuff NYT blogrunner NYT historical project OCLC Web Characterization Project Open Directory Project Perseus Digital Library Profusion Psychcrawler QuestionPoint ResearchBuzz. Resource Shelf Rutgers Libraries RxList Sch of East Eur & Slavonic Studies Search Engine Colossus Search Engine Guide Search Engine Showdown

Principles of Searching © Tefko Saracevic48 sources … Search Engine Watch Select Smart.com Snoopy Stanford Encyclopedia of Philosophy Surfwax Teoma The invisible Web The Scout Report. The Web Library Think Quest Turbo10 U California Berkeley U Mich Documents Center US State department Virtual Library Virtual Reference Desk Vivisimo Web Webbrain WebMD Wikipedia