Exploring the Deep Web Peter L. Kraus J. Willard Marriott Library – University of Utah.



What is the Deep Web? The deep Web is the hidden part of the Web, containing a huge volume of content that is inaccessible to conventional search engines, and consequently, to most users.

How big is the Deep Web?
- 550 billion documents
- 500 times the content of the surface Web
- Google has identified 1.2 billion documents
- An Internet search typically covers 0.03% (roughly 1/3000) of available content
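As a rough sanity check on the figures above, the arithmetic can be sketched in a few lines of Python. The constants are simply the numbers quoted on the slide (estimates, not measurements), and the function names are made up for this illustration. The slide's figures likely come from more than one source, so they need not reconcile exactly.

```python
from fractions import Fraction

# Figures as quoted on the slide; they are estimates, not measurements.
DEEP_WEB_DOCS = 550_000_000_000         # "550 billion documents"
DEEP_TO_SURFACE = 500                   # "500 times the content of the surface Web"
SEARCH_COVERAGE_PCT = Fraction(3, 100)  # "0.03%" of available content


def implied_surface_docs(deep_docs: int, ratio: int) -> int:
    """Surface-Web size implied by the deep-Web count and the 500x ratio."""
    return deep_docs // ratio


def one_in_n(percent: Fraction) -> Fraction:
    """Convert a percentage into the N of a '1 in N' ratio."""
    return Fraction(100) / percent


if __name__ == "__main__":
    # Implied surface Web: 1.1 billion documents.
    print(implied_surface_docs(DEEP_WEB_DOCS, DEEP_TO_SURFACE))
    # 0.03% is about 1 in 3,333, which the slide rounds to 1/3000.
    print(float(one_in_n(SEARCH_COVERAGE_PCT)))
```

The implied surface Web of about 1.1 billion documents is of the same order as the 1.2 billion documents the slide attributes to Google.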

What's in the Deep Web?
- Searchable databases
- Downloadable files and spreadsheets
- Image and multimedia files
- Data sets
- Various file formats such as .pdf
- Lots of government information

Why use the Deep Web?
- Higher-quality sources, selected and organized by subject experts
- Dynamic display
- Customized data sets
- Some data is visual, and not word searchable
- Regular search engines miss vast resources available in the Deep Web

Why are we talking about Government Sites in the Deep Web?
- Governments have the mandate and the capacity to gather information that individuals don't
- Most government information is copyright-free
- Government information is authoritative
- Governments have the financial and human resources to maintain Deep Web sites

The Web Today
- Web sites from the federal government occupy only about 1% of the entire global Web; however, they hold 85% of the Deep Web
- The content of these web sites includes a diversity of files in .html or .pdf format (reports, records, data sets, etc.)
- Little standardization or uniformity
- The common term for this content is Grey Literature

Definition of Grey Literature: That which is produced on all levels of government, academics, business and industry in print and electronic formats, but which is not controlled by commercial publishers.

Growth and Life of Federal Information
- On federal web sites the amount of information grew 13-fold between
- The average life expectancy of a federal web resource is 4 months (2003)

What can libraries do?
- LOCKSS-DOCS archival project (BYU and UU are members)
- Cooperative efforts in specific subject areas (Western Waters Digital Library)
- Individual institutional initiatives, such as institutional repositories, which reflect the institution's productivity in research (information often funded by federal grants)

Finding Naked People - Forsyth, Fleck (1996) (54 citations) This paper demonstrates an automatic system for telling whether there are naked people present in an image. The approach combines color and texture properties to obtain a mask for skin regions, which is shown to be effective for a wide range of shades and colors of skin. http.cs.berkeley.edu/~daf/newo2.ps.Z

Graph showing number of citations to Finding Naked People

Arches National Park : NASA Landsat 7 10/3/99

Searching for "University of Utah" (27 records in total): Development and Evaluation of Stitched Sandwich Panels; Larry E. Stanley; Daniel O. Adams; NASA Langley Research Center; NASA/CR , June 2001; "... test panels were produced initially at the University of Utah and later at NASA Langley Research Center ..." NASA-2001-cr pdf

Marriott Library, Salt Lake City, Utah, United States 9/18/2003 (TerraServer)

Utah Seismic Hazards (National Atlas)

International Deep Web Resources
- International organizations collect an amazing amount of data
- Statistical data is often best organized in database and spreadsheet format
- Like the US Government, individual countries post data files and databases
- This information may not be available in print sources in schools and libraries

United Nations Official Documents System

Why use the ODS?
- Full-text official United Nations documents (1993-) online, free
- Retrospective digitization in process
- Highly relevant material for almost any international topic
- Timely and authoritative

United Nations Statistical Databases
Value of the information:
- Authoritative
- Comparative
- Time series
- Compact
Database topics include:
- Commodity trade
- Demographics
- Disability statistics
- Social indicators
- Statistics on men and women

Individual Country Statistics

Why use this kind of information?
- Aggregate statistical sources are often not as up-to-date
- Individual countries are often more specific in their indicators than aggregate sources
- Information in databases, spreadsheets, and downloadable files is usually NOT searchable by web crawlers
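To illustrate why database content tends to escape crawlers, here is a minimal sketch, assuming a simple crawler that only follows hyperlinks. The sample page, its paths, and the class and function names are all hypothetical, invented for this example. Real crawlers are more sophisticated, but the basic limitation is the same: records behind a search form are generated only when someone submits a query.

```python
from html.parser import HTMLParser


class LinkAndFormFinder(HTMLParser):
    """Collects the hyperlinks a link-following crawler could visit,
    and separately notes any search forms it cannot fill in."""

    def __init__(self):
        super().__init__()
        self.links = []  # href targets of <a> tags
        self.forms = []  # action targets of <form> tags

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "form":
            self.forms.append(attributes.get("action") or "")


# Hypothetical statistics page: two static links plus a database search form.
SAMPLE_PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <a href="/reports/summary.html">Summary report</a>
  <form action="/search" method="get">
    <input name="country"><input type="submit" value="Search">
  </form>
</body></html>
"""


def survey(html):
    """Return (links, forms) found in the given HTML text."""
    finder = LinkAndFormFinder()
    finder.feed(html)
    return finder.links, finder.forms


if __name__ == "__main__":
    links, forms = survey(SAMPLE_PAGE)
    print("Reachable by following links:", links)
    print("Hidden behind query forms:", forms)
```

In this sketch the crawler can queue the two static pages, but the records served by /search stay invisible to it unless it knows which queries to submit.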

For Further Information: Marriott Library, University of Utah