WIRED Future Quick review of Everything What I do when searching, seeking and retrieving Questions? Projects and Courses in the Fall Course Evaluation.

Presentation transcript:

WIRED Future Quick review of Everything What I do when searching, seeking and retrieving Questions? Projects and Courses in the Fall Course Evaluation

WIRED Focus Information Retrieval: representation, storage, organization of, and access to information items Focus is on the user information need User information need: - Find all docs containing information on Austin which: (a) are hosted by utexas.edu; (b) discuss restaurants. Emphasis is on the retrieval of information (not data, not just a keyword match)

Quick Overview of the IR Process (diagram; labels: Documents, Information Need, index, query, match, Ranking, documents?)

Indexing and Searching Query models work against the index - Find words, word counts, phrases - Sequential search, indexed search Inverted Files & Other Indices Boolean Queries Sequential Searching Pattern Matching Structural Queries Data structures - The infrastructure of search - Varied per data set and query contexts
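The contrast between sequential search and indexed search with Boolean queries can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the three sample documents, the whitespace tokenizer, and the function names (`build_inverted_index`, `boolean_and`, `boolean_or`, `sequential_search`) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical three-document collection, invented for illustration.
DOCS = {
    1: "Austin restaurants near campus",
    2: "Austin live music venues",
    3: "Dallas restaurants guide",
}


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents containing every term: intersection of the posting sets."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms):
    """Documents containing at least one term: union of the posting sets."""
    result = set()
    for t in terms:
        result |= index.get(t.lower(), set())
    return result


def sequential_search(docs, term):
    """Scan every document without an index, for comparison."""
    term = term.lower()
    return {d for d, text in docs.items() if term in text.lower().split()}


INDEX = build_inverted_index(DOCS)
print(boolean_and(INDEX, "austin", "restaurants"))
print(boolean_or(INDEX, "austin", "dallas"))
print(sequential_search(DOCS, "restaurants"))
```

The indexed queries only touch the posting sets for the query terms, while the sequential search reads every document each time; that trade-off is why the data structure is chosen per data set and query context.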

Personalized IR system design How would you design a personal IR system? Who would use it? How would you learn about them? - Interests - Sources - Preferences How do you evaluate a personal system? Understanding users is the key to personalizing search or search interfaces.

Information Seeking in Context (diagram; labels: Learning, Information Seeking, Information Retrieval, Analytical Strategy, Browsing Strategy)

How do we search? Analytical: careful planning; recall of query terms; iterative query reformulations; examination of results; batched. Browsing: heuristic; opportunistic; recognizing relevant information; interactive (as can be).

Behavioral Model Recurring Web behavioral patterns that relate people’s browser actions (Web moves) to their browsing/searching context (Web modes) Modes of scanning: Aguilar (1967) & Weick & Daft (1983, 1984) Moves in information seeking behavior: Ellis (1989) & Ellis et al. (1993, 1997)

ISeek Behaviors & Web Moves

What do I use? Starting - Bookmarks and groups of bookmarks - Search javascripts Chaining - Tabbed windows - Bookmarking - Printing Browsing and Differentiating - Firefox/Mozilla & recommended links - Blogrolls and PageRank Monitoring - RSS feeds with RSS reader - (Moderated) Listservs Extracting - Saving as HTML, Text, or PDF - Bookmarking & Printing

How do we really use the Web? People don’t read, they scan Web pages We move quickly, we know we can go back Quick experimentation & short memory Behaviors that work are reinforced & continued Satisficing makes measures of quality difficult

How do I use the Web? Set of standard, daily Web pages Set of “occasional” Web pages - Fridays - movie reviews, show times, previews - Monthly - stocks and funds Quick focus on a subject, build a set of documents related to that and file for later use I scan quickly down the page and then back up the page Site maps, other links, walk up the URL
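"Walk up the URL" means trimming path segments one at a time to reach parent pages. A minimal sketch of that habit, assuming a hypothetical helper name `walk_up` and an invented example URL, using only the standard library:

```python
from urllib.parse import urlsplit, urlunsplit


def walk_up(url):
    """Return the parent URLs of `url`, nearest first, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    while segments:
        segments.pop()  # drop the last path segment
        path = "/" + "/".join(segments) + ("/" if segments else "")
        # Query string and fragment are discarded on the way up.
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


# Hypothetical URL, for illustration only.
for parent in walk_up("http://example.org/courses/wired/week6.html"):
    print(parent)
```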

Future: Social Issues Who controls the sharing? Who controls the controls? “Give to get” systems Anonymity vs. Community - Community of “friends” - People as data points Free riders Logrolling and Over-rating

Future: Filtering for IR How about filtering, without the collaboration? - Individual preferences - Implicit and Explicit Text is analyzed - Feature extraction - Recall & precision measures New models for multidimensional users/uses/ratings Relevance Feedback - Faster matching, more accurate - Metadata (use data, preferences)
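The recall and precision measures mentioned above have standard set-based definitions: precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that were retrieved. A minimal sketch, where the function name and the document ids are assumptions made for the example:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgment: the filter returned docs 1-4; docs 2, 4 and 5 are relevant.
p, r = precision_recall([1, 2, 3, 4], [2, 4, 5])
print(f"precision={p:.2f} recall={r:.2f}")
```

Returning 0.0 for an empty retrieved or relevant set is a convention chosen here to avoid division by zero; other conventions exist.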

Future: Community Centered CF Forming and keeping community - Interfaces, functionality Helping people find new information - Interactive search - Group browsing Mapping community (prefs?) - Daily News Rating Web pages - Incenting users to share Providing access to stored preferences - Fair, open data collection - Users can tune data

WWW Documents Investigation How do you collect data like this? - Web Crawler URL identifier, link follower - Index-like processing Markup parser, keyword identifier Domain name translation (and caching) How do these facts help with indexing? Have general characteristics changed? (This would be a great project to update.)
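The "URL identifier, link follower" and "markup parser" pieces of a crawler can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the tool used in the study: the class and function names are invented, and the crawl runs over a hypothetical in-memory `PAGES` dictionary instead of fetching from the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Markup parser that collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def crawl(pages, start):
    """Breadth-first link follower over an in-memory {url: html} mapping."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in extract_links(pages.get(url, ""), url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical three-page site, standing in for fetched documents.
PAGES = {
    "http://example.org/": '<a href="a.html">A</a> <a href="b.html">B</a>',
    "http://example.org/a.html": '<a href="b.html">B</a>',
    "http://example.org/b.html": '<a href="/">home</a>',
}
print(crawl(PAGES, "http://example.org/"))
```

A real crawler would add fetching, politeness delays, robots.txt handling, and the DNS caching the slide mentions; the `seen` set is what keeps the link follower from looping on cycles.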

Metadata Information that describes a document that is not (necessarily) in the document Describes the document in relation to other documents Context about the Content Document semantics Internally consistent descriptions of content for individual documents, document sets or a specified set of content. For collections or individual documents

Metadata Types Dublin Core elements MARC (Machine-Readable Cataloging) - What isn’t machine readable? Semantic Web elements Bottom-up, derived data Format-based - ASCII, EBCDIC - RTF - PostScript - PDF - MIME
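As a concrete instance of descriptive metadata, a record can be checked against the fifteen elements of the Dublin Core Metadata Element Set. The element names below are the standard ones; the record values, the helper name `unknown_elements`, and the dictionary representation are assumptions made for this sketch.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def unknown_elements(record):
    """Return the keys of `record` that are not Dublin Core elements."""
    return sorted(set(record) - DUBLIN_CORE_ELEMENTS)


# Hypothetical record describing a stored paper; all values are invented.
RECORD = {
    "title": "Notes on Web Information Retrieval",
    "creator": "A. Student",
    "date": "2005-05-01",
    "format": "application/pdf",  # a MIME type, one of the format-based descriptions
    "language": "en",
    "pages": 12,  # not a Dublin Core element; flagged below
}
print(unknown_elements(RECORD))
```

Note that none of these values has to appear inside the document itself, which is the point of metadata as context about the content.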

Digital Libraries We all have them - boxes, archives - Papers written - Bookmarks What I have - 4GB of academic & technical papers Mostly PDF, HTML, text - Indexed using Adobe Catalog, htDig, OS X Search - Data sets from previous studies - Program code - Scanned documents

Big DigLib Questions What’s a document? - A file or link How do you trace & track the information source? - Filenames, memory, metadata How do you integrate the variety of documents & metadata? - Stick to standard formats What kind of storage model? - Version Control system - Server storage - Filenames and directories When do you Index? - Continuously - After a backup Mostly Boolean searching with attributes
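Boolean searching with attributes over a personal collection can be as simple as filtering records on required and excluded attribute values. A minimal sketch, assuming invented file names, an invented attribute scheme (`format`, `kind`, `year`), and a hypothetical `search` helper:

```python
# Hypothetical catalog of a personal digital library; all entries are invented.
RECORDS = [
    {"name": "ir-survey.pdf", "format": "pdf", "kind": "paper", "year": 2003},
    {"name": "notes.html", "format": "html", "kind": "paper", "year": 2004},
    {"name": "crawler.py", "format": "text", "kind": "code", "year": 2004},
    {"name": "scan-001.pdf", "format": "pdf", "kind": "scan", "year": 2002},
]


def search(records, must=None, must_not=None):
    """Names of records matching every `must` attribute (AND) and none of `must_not` (NOT)."""
    must = must or {}
    must_not = must_not or {}
    return [
        r["name"]
        for r in records
        if all(r.get(k) == v for k, v in must.items())
        and not any(r.get(k) == v for k, v in must_not.items())
    ]


# kind = paper AND NOT format = html
print(search(RECORDS, must={"kind": "paper"}, must_not={"format": "html"}))
```

This scans the whole catalog on every query, which is workable for a personal collection; a larger one would index the attributes instead.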

Course Evaluations (next week) Volunteer to get, distribute, collect and turn-in evaluations Overall level of class expertise relevant for you? Favorite readings – type of readings? Least favorite (obscure – difficult) readings? Project ideas and group organization tools? Assignments: Group Work vs. Papers?