8/31/2000 Information Organization and Retrieval: What is Information? The Nature, Growth and Characteristics of Information. University of California, Berkeley.

Presentation transcript:

What is Information? The Nature, Growth and Characteristics of Information
University of California, Berkeley, School of Information Management and Systems
SIMS 202: Information Organization and Retrieval

What is Information?
There is no “correct” definition
Can involve philosophy, psychology, signal processing, physics
Cookie Monster’s definition:
  – “news or facts about something”
Oxford English Dictionary
  – information: informing, telling; thing told, knowledge, items of knowledge, news
  – knowledge: knowing familiarity gained by experience; person’s range of information; a theoretical or practical understanding of; the sum of what is known

Assignment 1
What is information, according to your background or area of expertise?

Types of Information
Differentiation by form
Differentiation by content
Differentiation by quality
Differentiation by associated information

Information Properties
Information can be communicated electronically
  – Broadcasting
  – Networking
Information can be easily duplicated and shared
  – Problems of Ownership
  – Problems of Control
Adapted from ‘Silicon Dreams’ by Robert W. Lucky

Intuitive Notion (Losee 97)
Information must
  – Be something, although the exact nature (substance, energy, or abstract concept) is not clear
  – Be “new”: repetition of previously received messages is not informative
  – Be “true”: false or counterfactual information is “mis-information”
  – Be “about” something
This human-centered approach emphasizes meaning and use of message

Information from the Human Perspective
Levels in cognitive processing
  – perception
  – observation/attention
  – reasoning, assimilating, forming inferences
Knowledge: justified true belief
Belief: an idea held based on some support; an internally accepted statement, result of inductive processes combining observed facts with a reasoning process
Does information require a human mind?
  – Communication and information transfer among ants
  – A tree falls in the forest … is there information there?
  – Existence of quarks

Meaning vs. Form
Form of information as the information itself
Meaning of a signal vs. the signal itself
  – What aspects of a document are information?
Representation (Norman 93)
  – Why do we write things down?
      Socrates thought writing would obliterate serious thought
      Sounds and gestures fade away
  – Artifacts help us to reason
  – Anything not present in the representation can be ignored
  – Things left out of the representation are often what we don’t know how to represent

Information Hierarchy
Wisdom
Knowledge
Information
Data

Information Hierarchy
Data
  – The raw material of information
Information
  – Data organized and presented by someone
Knowledge
  – Information read, heard or seen and understood
Wisdom
  – Distilled and integrated knowledge and understanding

Information
Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?
-- T.S. Eliot, “The Rock”
Where is the information we have lost in data?

Origins
Very early history of content representation
  – Sumerian tokens and “envelopes”
  – Alexandria: pinakes
  – Indices

Origins
Biblical Indexes and Concordances
  – Hugo de St. Caro, 1247 A.D.: 500 monks, KWOC
  – Book indexes (Nuremberg Chronicle)
Library Catalogs
Journal Indexes
“Information Explosion” following WWII
  – Cranfield Studies of indexing languages and information retrieval
  – Development of bibliographic databases
      Index Medicus: production and Medlars searching
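The KWOC (keyword out of context) idea named on this slide can be illustrated with a small sketch: each significant word becomes an index heading, listed alongside the full title it came from. This is a modern illustration only, not a reconstruction of any historical procedure; the stopword list, the sample titles, and the function name `kwoc_index` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stopword list: words too common to be useful index headings.
STOPWORDS = {"a", "an", "and", "of", "the", "in", "to", "for", "on"}


def kwoc_index(titles):
    """Map each significant keyword to the full titles that contain it."""
    index = defaultdict(list)
    for title in titles:
        seen = set()  # list a title only once per keyword
        for word in title.lower().split():
            word = word.strip(".,;:!?\"'()")
            if word and word not in STOPWORDS and word not in seen:
                seen.add(word)
                index[word].append(title)
    # Sort headings alphabetically, as a printed index would be.
    return dict(sorted(index.items()))


# Made-up sample titles for illustration.
titles = [
    "The Nature of Information",
    "Information Retrieval and Indexing",
]
index = kwoc_index(titles)
for keyword, entries in index.items():
    for entry in entries:
        print(f"{keyword:<12} {entry}")
```

The keyword appears outside its context (as a heading), while the full title is kept intact beside it, which is what distinguishes KWOC from in-context (KWIC) layouts.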

Information Theory
Claude Shannon, 1940s, studying communication
Ways to measure information
  – Communication: producing the same message at its destination as that seen at its source
  – Problem: a “noisy channel” can distort the message
Between transmitter and receiver, the message must be encoded
Semantic aspects are irrelevant
[Diagram: Message source → Transmitter → Channel (subject to Noise) → Receiver → Destination]
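One standard way to measure information in Shannon's framework is entropy, H = -Σ p·log2(p), expressed in bits per symbol. The slide does not give a formula, so the following is only a minimal sketch. It assumes symbol probabilities are estimated from the frequencies within a single message (a simplification), and the function name `entropy_bits` is hypothetical.

```python
import math
from collections import Counter


def entropy_bits(message):
    """Shannon entropy of a message, in bits per symbol.

    Probabilities are estimated from symbol frequencies in the message itself.
    """
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A message with no variety carries no information per symbol;
# more evenly spread symbols carry more.
for sample in ["aaaa", "abab", "abcd"]:
    print(sample, entropy_bits(sample))
```

Note that the measure depends only on symbol statistics, never on what the message means, which matches the slide's point that semantic aspects are irrelevant.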

Information Theory
Better called “Communication Theory”
Communication may be over time and space
[Diagram 1: Source → Encoding → Channel (subject to Noise) → Decoding → Destination, carrying a Message]
[Diagram 2: Source → Encoding (writing/indexing) → Storage → Decoding (retrieval/reading) → Destination, carrying a Message]
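The role of encoding and decoding around a noisy channel can be sketched with a toy example. It assumes a channel that flips each bit independently with some probability and a three-fold repetition code with majority-vote decoding; neither choice comes from the slides, and all function names are illustrative.

```python
import random


def encode(bits, repeat=3):
    """Repetition code: send every bit `repeat` times."""
    return [b for b in bits for _ in range(repeat)]


def noisy_channel(bits, flip_prob, rng):
    """Flip each bit independently with probability `flip_prob`."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


def decode(bits, repeat=3):
    """Majority vote over each block of `repeat` received bits."""
    out = []
    for i in range(0, len(bits), repeat):
        block = bits[i:i + repeat]
        out.append(1 if sum(block) * 2 > len(block) else 0)
    return out


message = [1, 0, 1, 1, 0, 0, 1, 0]
rng = random.Random(0)  # fixed seed so runs are repeatable
received = noisy_channel(encode(message), flip_prob=0.05, rng=rng)
recovered = decode(received)
print("sent:     ", message)
print("recovered:", recovered)
```

With this code a single flipped bit per block is corrected, at the cost of sending three times as many bits; two or more flips in one block still cause an error.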

What kinds of information are there?
Text
  – books, periodicals, WWW, memos, ads
  – published/refereed
Film
Photos, other Images
Broadcast TV, Radio
Telephone Conversations
Databases

How much information is there?
(Estimates courtesy Hal Varian and Peter Lyman)

How Much Information?
Stored Information
  – Print
  – Film
  – Optical
  – Magnetic
Communicated
  – Internet
  – Broadcast
  – Phone
  – Mail

Print
Annual Production
  – Books 968,735 = 8 Terabytes (compressed image)
  – Newspapers = 25 Terabytes
  – Journals = 2 Terabytes
  – Magazines = 10 Terabytes
  – Office Documents 12x10^9 pages = 312 Terabytes
  – TOTAL 357 Terabytes (1824 scanned, 35 text)
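The arithmetic behind the print figures can be checked with a short sketch. It uses only the numbers on the slide and assumes decimal units (1 terabyte = 10^12 bytes), which the slide does not state; the variable names are illustrative.

```python
TB = 10**12  # assumption: decimal terabytes
MB = 10**6

# Annual print production from the slide, in terabytes.
print_tb = {
    "books": 8,
    "newspapers": 25,
    "journals": 2,
    "magazines": 10,
    "office_documents": 312,
}

# The categories should add up to the slide's stated total.
total_tb = sum(print_tb.values())

# Implied storage per unit, derived from the slide's counts.
office_pages = 12 * 10**9
bytes_per_office_page = print_tb["office_documents"] * TB // office_pages
book_count = 968_735
mb_per_book = print_tb["books"] * TB / book_count / MB

print("total TB:", total_tb)
print("bytes per office page:", bytes_per_office_page)
print("MB per book:", round(mb_per_book, 2))
```

Under these assumptions the categories sum to the stated 357 terabytes, and the office-document figure implies roughly 26 KB per page.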

Print
Library of Congress printed book collection
  – About 18 Million books
  – About 130 Terabytes (compressed image)
  – For all of LC we should also assume
      13M photographs, 5MB each = 65 TB
      4M maps, say 200 TB
      500K files, 1GB each = 500 TB
      3.5M sound recordings, ~2000 TB
      Grand total: 3 petabytes (~3000 terabytes)
Books in Print
  – 3.2 Million titles
  – About 26 Terabytes
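The Library of Congress estimate is a sum of simple products, which the sketch below reproduces. It assumes decimal units (1 MB = 10^6 bytes, and so on) and assumes the grand total includes the printed books; the slide states neither, and the variable names are illustrative.

```python
TB = 10**12  # assumption: decimal units throughout
GB = 10**9
MB = 10**6

# Items given as count x size on the slide.
photos_tb = 13_000_000 * 5 * MB / TB
files_tb = 500_000 * 1 * GB / TB

# All components in terabytes; maps and sound recordings are
# given directly as totals on the slide.
components_tb = {
    "printed books": 130,
    "photographs": photos_tb,
    "maps": 200,
    "files": files_tb,
    "sound recordings": 2000,
}
total_tb = sum(components_tb.values())

# Implied size per book for the two book collections.
mb_per_lc_book = 130 * TB / 18_000_000 / MB
mb_per_title_in_print = 26 * TB / 3_200_000 / MB

print("total TB:", total_tb)
print("MB per LC book:", round(mb_per_lc_book, 2))
print("MB per title in print:", round(mb_per_title_in_print, 3))
```

The components come to just under 2,900 terabytes, which the slide rounds to 3 petabytes.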

Film and Image
Film
  – Photographs = 410 Petabytes per year
  – Movies = 16 Terabytes (Commercial Production of about 4000 films)
  – X-Rays = 12 Petabytes

Optical Media
CD-Music    90,000 items = 58 TB
CD-ROM       3,000 items =  3 TB
DVD-Video    5,000 items = 22 TB
Total                      83 TB

Magnetic Media
Audio Tape 184,200,000 = Petabytes
Video Tape 355,000,000 = 1420
Floppy disks = 0.07
Removable disks = 1.69
Hard Disks = 500

Totals Stored Per Year
Medium     Type of content     Terabytes/Year    Terabytes/Year
                               Upper Bound       Lower Bound
Paper      Books               8                 7
           Newspapers          …                 …
           Periodicals         …                 …
           Office documents    …                 …
           SUBTOTAL            …                 …
Film       Photographs         410,000           …,000
           Cinema              …                 …
           X-Rays              12,000            12,000
           SUBTOTAL            422,016           …,016
Optical    Music CDs           …                 …
           Data CDs            3                 3
           DVDs                …                 …
           SUBTOTAL            …                 …
Magnetic   Camcorder           300,000           300,000
           Disk drives         2,555,000         1,000,200
           SUBTOTAL            2,855,000         1,300,200
TOTAL                          3,277,440         1,412,632

Internet Traffic: Historical
[Chart panels: Nov ‘92, Apr ‘95]
Dec 1996 = 1500Tb
Dec 1997 = 3000Tb

Internet Traffic
[Chart panels: Nov ‘92, Apr ‘95]

Current Size of Web
There are an estimated 2.1 Billion pages on the Web
  – About 21 Terabytes
  – About 7500 further Terabytes in web-accessed DBs
610 Billion messages per year = TB
Internet Traffic is doubling every 100 days
An estimated 62 Million Americans now use the internet (US Commerce Dept 1998)
Radio took 38 years to get 50 M listeners, TV took 13 years, the Net took 4 years...
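Two of these figures lend themselves to a quick calculation: what "doubling every 100 days" implies over a year, and the average page size implied by the page count and total size. This is a sketch only, assuming steady exponential growth, a 365-day year, and decimal terabytes; the variable names are illustrative.

```python
TB = 10**12  # assumption: decimal terabytes

# Growth implied by "doubling every 100 days", assuming it holds all year.
doubling_days = 100
annual_growth_factor = 2 ** (365 / doubling_days)

# Average page size implied by 2.1 billion pages totalling 21 terabytes.
pages = 2_100_000_000
web_bytes = 21 * TB
bytes_per_page = web_bytes / pages

print("annual growth factor:", round(annual_growth_factor, 2))
print("average bytes per page:", bytes_per_page)
```

A 100-day doubling time compounds to more than a twelvefold increase per year, and the size estimate works out to about 10 KB per page.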

Internet - Recent Statistics
5 M Level 2 Domains (NW June 1999)
43.2 Million Hosts (NW January 1999)
206/246 IP countries (NW July 1998)
300 Million Users (Newsbytes, Mar 2000)
(830 Million Telephone Terminations)
Source: Vint Cerf

[Chart: Internet Hosts (000s)]
Source: Vint Cerf

[Chart: Projected Voice and Data Traffic, Gb/s]
Source: America's Network, May 15, 1998

Users on the Internet - May 1999
CAN/US M
Europe M
Asia/Pac M
Latin Am M
Africa M
Mid-east M
Total - 165M
Source: Vint Cerf

[Chart: Language Distribution of Web Content]
Source: Jack Xu, Excite

[Chart: Language Distribution in a 634 Million Web Page Corpus]

Sources on Information, Computer, and Network Use
www/numbers.html
  – Statistical snippets extracted from the news
up/
  – Vint Cerf’s pages
man/index.html
  – The size and growth rate of the Internet by K.G. Coffman and Andrew Odlyzko

Human Memory
  – Landauer 86: Human brain holds 200MB
      looked at rate of information intake and rate of forgetting, and amount of information adults need for normal tasks
  – 6B people on earth implies total memory of all people alive about 1,200 petabytes
  – Another way: estimate that people take in a byte/sec
      lifetime 25,000 days or 2B sec
      result is 2 GB (doesn’t count synthesizing new info)
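Both estimates on this slide are short multiplications, reproduced in the sketch below. It assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes), which the slide does not state, and it converts the 2 billion second lifetime back into days and years as a cross-check; the variable names are illustrative.

```python
MB = 10**6  # assumption: decimal units throughout
GB = 10**9
PB = 10**15

# Estimate 1: 200 MB per brain times 6 billion people.
bytes_per_person = 200 * MB
people = 6 * 10**9
total_pb = bytes_per_person * people / PB

# Estimate 2: intake of one byte per second over a lifetime of 2 billion seconds.
lifetime_seconds = 2 * 10**9
intake_gb = lifetime_seconds * 1 / GB

# Cross-check: how long is 2 billion seconds?
lifetime_days = lifetime_seconds / 86_400
lifetime_years = lifetime_days / 365.25

print("all human memory, PB:", total_pb)
print("lifetime intake, GB:", intake_gb)
print("lifetime in days:", round(lifetime_days))
print("lifetime in years:", round(lifetime_years, 1))
```

The two methods differ by a factor of ten per person (200 MB versus 2 GB), which is consistent with the first estimate accounting for forgetting while the second counts only intake.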

Information Overload
“The greatest problem of today is how to teach people to ignore the irrelevant, how to refuse to know things, before they are suffocated. For too many facts are as bad as none at all.”
(W.H. Auden)