Personal Information Management
Vitor R. Carvalho
11-749: Personalized Information Retrieval
Carnegie Mellon University
February 8th, 2005

Motivation
- 1 person → several tasks
- Several contexts
- Several past activities
- Several collaborators
- Several future plans
- More and more personal information is stored
- Where's that document?
- Where's the link to that blue hotel in New York?

Document Types
Some Commercial (Partial) Solutions
- Web links
- Passwords
- Calendar
- Text, PDF, ZIP, PS, LaTeX, RTF, DOC, XML, XLS, PPT, etc.
- IM
- Audio
- Video
Research: retrieval techniques, prototypes, evaluation, HCI, how users access old documents, visualization, etc.

1st System: Haystack
- From 1997 to the present, MIT.
- Comprehensive system to personalize IR and the relationship between a particular individual and his corpus.
- Agnostic about the particular search tool used.
- Augments the power of search tools by personalizing and improving the representation of the recorded data.
- Uses a very general data structure.
- Supports different annotations and different collections.
- Work with information, not programs: IM + to-do list + calendar + web browser + photos + etc., together.
- Indexing is done incrementally, during "calm" periods.
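The last point, incremental indexing during "calm" periods, can be illustrated with a minimal sketch. This is not Haystack's code: the class `IncrementalIndexer`, its methods, and the `budget` parameter are hypothetical names chosen for illustration, and the index is a plain in-memory inverted index.

```python
from collections import defaultdict, deque


class IncrementalIndexer:
    """Toy inverted index that defers indexing until the system is idle."""

    def __init__(self):
        self.pending = deque()          # documents waiting to be indexed
        self.index = defaultdict(set)   # term -> set of document ids

    def add(self, doc_id, text):
        # Adding a document is cheap: it is only queued, not indexed yet.
        self.pending.append((doc_id, text))

    def index_pending(self, budget):
        # Meant to be called during a calm period; indexes at most
        # `budget` queued documents and returns how many were processed.
        done = 0
        while self.pending and done < budget:
            doc_id, text = self.pending.popleft()
            for term in text.lower().split():
                self.index[term].add(doc_id)
            done += 1
        return done

    def search(self, term):
        return sorted(self.index.get(term.lower(), set()))


indexer = IncrementalIndexer()
indexer.add("mail-1", "Hotel booking New York")
indexer.add("note-1", "Call the hotel tomorrow")
indexer.index_pending(budget=1)      # only the first document is indexed
first_pass = indexer.search("hotel")
indexer.index_pending(budget=10)     # a later calm period indexes the rest
second_pass = indexer.search("hotel")
```

The trade-off is that a freshly added document is not searchable until the next idle pass, in exchange for keeping the user's foreground work responsive.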

Haystack Architecture

3 ways to harvest data:
- Data driven: docs already in Haystack (deletion, selection, etc.) or new docs added by the user
- Observers: observing the user's moves (browsing, searching, saving queries, etc.)
- Human annotation: via a special interface
Hard to evaluate in large studies
You can download the first versions of Haystack from
New Eclipse-based Semantic Web Browser (based on Haystack)

2nd System: KFTF (Keeping Found Things Found)
- User study and survey on how individuals keep and organize information they have found on the web and want to re-access and reuse.
- 24/214 participants: researchers, managers and information professionals.
Figure 2: Top 7 keeping methods as ranked by proportion of participants using the method at least once a week

3rd System: Stuff I've Seen (SIS)
- 2003, Microsoft.
- Design and evaluation (user study) of a system to "find things you have seen before".
- 58-81% of web page visits are re-visits. Unix commands, library borrowing, human memory, etc. show similar reuse.
Main ideas:
1. Unified index of information across different sources (calendar, web, email, files, etc.)
2. Rich contextual cues to trigger memory (author, time, thumbnails, etc.)
3. Friendly interface that allows quick feedback and iterative refinement
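The first two ideas can be sketched together: one collection holding items from every source, searched by any combination of contextual cues. This is only an illustration under stated assumptions, not the SIS implementation; the `Item` fields, the source labels, and the `find` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    source: str      # e.g. "web", "calendar", "file" (labels are assumptions)
    title: str
    author: str
    seen_on: date


def find(items, text=None, source=None, author=None, since=None):
    """Filter one shared collection by any combination of cues."""
    results = []
    for item in items:
        if text and text.lower() not in item.title.lower():
            continue
        if source and item.source != source:
            continue
        if author and item.author != author:
            continue
        if since and item.seen_on < since:
            continue
        results.append(item)
    # Most recently seen first, so the user can refine iteratively.
    return sorted(results, key=lambda i: i.seen_on, reverse=True)


items = [
    Item("web", "Blue hotel in New York", "hotel site", date(2005, 1, 10)),
    Item("file", "Trip budget", "Ana", date(2005, 1, 12)),
    Item("calendar", "New York trip planning", "Ana", date(2005, 1, 15)),
]
by_text = find(items, text="new york")
by_author = find(items, author="Ana", since=date(2005, 1, 13))
```

Because every source shares one record shape, a cue such as author or date narrows results across web pages, files and appointments at once, which is the point of a unified index.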

Stuff I’ve Seen

SIS - Evaluation
- Supports Boolean as well as best-match (Okapi probabilistic ranking algorithm) retrieval on text and metadata properties.
- Allows phrases, wildcards and proximity search.
- 234 people over 6 weeks.
- Only 7.5% used Boolean operators or phrases in the query.
- Queries were short (1.59 words), compared with about 2.35 words on the web.
- Personal datasets ranged from 5K to 100K items.
- Most used filters: file type and date range.
- Most common query type: people's names.
- File types opened: email (76%), web (14%), files (14%).
- Standard ranking functions seem less important in this context.
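For readers unfamiliar with Okapi-style best-match ranking, here is a minimal sketch of BM25, the best-known ranking function from the Okapi family. The slide does not say which Okapi variant or parameters SIS used, so the formula variant (a smoothed, non-negative IDF), the defaults `k1=1.2` and `b=0.75`, and the function name `bm25_scores` are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "meeting notes from vitor about haystack".split(),
    "vitor sent slides vitor sent notes".split(),
    "hotel booking in new york".split(),
]
scores = bm25_scores(["vitor"], docs)
```

With a one-word query such as a person's name, the score reduces to a single term's weight, which is consistent with the slide's remark that ranking functions matter less here than filters such as date and file type.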

SIS - Evaluation
- Similar power functions were found in web page re-access and memory re-access.
- Overall, the system was very well accepted.

4th System: Using Temporal Landmarks
- 2003, Microsoft. Based on the "Stuff I've Seen" system.
- Synthesis of 2 ideas:
  - Episodic memory: use landmarks in the user's memory as cues to retrieve information (JFK assassination, 9/11, an unforgettable Steelers game, vacations, etc.)
  - Timeline visualizations: visualize the personal dataset in sequential time

Temporal Landmarks - Evaluation
- Selection of public landmarks: priority of important holidays, analysis of news headlines, etc. User-driven approach.
- Selection of personal landmarks: different priorities for calendar appointments; "out of office" times; recurrent appointments have low priority; digital photographs (the first photo of the day was selected).
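The personal-landmark rules can be sketched as follows. The slide states only that recurrent appointments get low priority and that the first photo of each day is chosen; the numeric weights, the higher weight for "out of office" entries, and all names below are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Appointment:
    title: str
    start: datetime
    recurrent: bool = False
    out_of_office: bool = False


def appointment_priority(appt):
    # Hypothetical weights: only "recurrent is low" comes from the slide.
    if appt.recurrent:
        return 1
    if appt.out_of_office:
        return 3
    return 2


def first_photo_per_day(photo_times):
    """Keep the earliest photo timestamp for each calendar day."""
    first = {}
    for taken in photo_times:
        day = taken.date()
        if day not in first or taken < first[day]:
            first[day] = taken
    return [first[day] for day in sorted(first)]


photos = [
    datetime(2005, 2, 5, 14, 0),
    datetime(2005, 2, 5, 9, 30),
    datetime(2005, 2, 6, 11, 0),
]
landmarks = first_photo_per_day(photos)

appts = [
    Appointment("Weekly sync", datetime(2005, 2, 7, 10, 0), recurrent=True),
    Appointment("Vacation", datetime(2005, 2, 12, 0, 0), out_of_office=True),
]
ranked = sorted(appts, key=appointment_priority, reverse=True)
```

Events that rank highest would be the ones drawn on the timeline as memory cues next to the search results.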

Some Questions
1. I was wondering what people think about using a whole personalized web search system with, or as, a query observer in the Haystack system. This might be interesting if the system had access to Haystack's internal data and could write back to it.
2. The general data model of the Haystack approach is quite similar to the knowledge map approach. Even though they applied their specific needs to this kind of semantic network, the paper omitted a semantic network retrieval model. Are there any papers that would allow us to retrieve from these semantic networks?