Document (Text) Visualization Mao Lin Huang

Paper Outline
Introduction
Visualizing text
Visualization transformations: from text to pictures
Examples from the MVAB Project
Conclusions and directions for future research and development

Introduction
Current visualization approaches
  – For visualizing mostly structured and/or hierarchical information
Some research in information retrieval
  – Utilized graph theory or figural display
  – Information returned is documents in text form
      Users still have to read
      Causes a severe upper limit
Open Source digital information
  – Available text overwhelms the traditional reading methods of inspection, sift and synthesis

Visualizing text
True text visualizations
  – Must represent textual content and meaning without the user having to read it
  – Result from content abstraction and spatialization of the text document
Use primarily preattentive, parallel processing powers of visual perception
Goal is to spatially transform text information into a new visual representation

Visualization transformations: from text to pictures
Four important technical considerations
  – Clear definition of text
      What comprises text
      How it can be distinguished from other symbolic representations
  – Way to transform raw text into a different visual form
  – Foundation for meaningful visualization
      Suitable mathematical procedures and analytical measures
  – A database management system

Processing Text
Requirements of text processing engine
  – Identification and extraction of text features
      Frequency-based measures on words
      Higher order statistics taken on the words
      Semantic in nature
  – Efficient and flexible representation of documents in terms of these text features
  – Support for information retrieval and visualization
      Pre-process, indexing
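
A minimal sketch of the simplest feature type listed above, frequency-based measures on words. It is an illustration only, not the text engine used in the paper: the tokenizer, the function names (`tokenize`, `build_vocabulary`, `document_vector`), and the two sample documents are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(docs):
    """Return a sorted list of every distinct word in the collection."""
    return sorted({word for doc in docs for word in tokenize(doc)})


def document_vector(text, vocabulary):
    """Map a document to a unit-length vector of raw word frequencies."""
    counts = Counter(tokenize(text))
    vec = [float(counts[word]) for word in vocabulary]
    norm = math.sqrt(sum(x * x for x in vec))
    # Leave an all-zero vector untouched to avoid dividing by zero.
    return [x / norm for x in vec] if norm else vec


# Hypothetical two-document collection.
docs = [
    "text visualization maps text to pictures",
    "clustering groups similar documents",
]
vocab = build_vocabulary(docs)
vectors = [document_vector(d, vocab) for d in docs]
```

Each document becomes one vector with one component per vocabulary word, which is the representation the next slide builds on. Higher order statistics and semantic features would replace the raw counts but keep the same vector shape.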

Visualizing output from text processing
Representing the document
  – A vector in high dimensional feature space
Comparisons, filters, and transformations can be applied
Clustering using the normalized document vectors
Projection
  – Principal Components Analysis
  – Multi-Dimensional Scaling
  – Exponential order of complexity
Clustering in the high-dimensional feature space
Visualize the cluster centroids
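
The pipeline on this slide (cluster in the high-dimensional space, then project only the cluster centroids) can be sketched as follows. This is a hedged illustration, not the SPIRE implementation: plain k-means stands in for whatever clustering the authors used, the principal axes are found by power iteration, and every name and the six sample vectors are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; seeding with the first k vectors is an assumption."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def principal_axes(centered, n_axes=2, steps=100):
    """Leading principal directions via power iteration with deflation."""
    rows = [list(r) for r in centered]
    axes = []
    for _ in range(n_axes):
        # Start from the longest remaining row so the start lies in the row span.
        start = max(rows, key=lambda r: dot(r, r))
        norm = math.sqrt(dot(start, start))
        if norm == 0.0:
            break  # no variance left to explain
        v = [x / norm for x in start]
        for _ in range(steps):
            # w = (X^T X) v, accumulated row by row.
            w = [0.0] * len(v)
            for r in rows:
                weight = dot(r, v)
                w = [wj + weight * rj for wj, rj in zip(w, r)]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        axes.append(v)
        # Deflate: remove the component along v from every row.
        rows = [[x - dot(r, v) * vx for x, vx in zip(r, v)] for r in rows]
    return axes


def project(vectors, center, axes):
    """Coordinates of each vector, relative to center, along the given axes."""
    return [[dot([x - c for x, c in zip(v, center)], a) for a in axes]
            for v in vectors]


# Hypothetical 4-dimensional document vectors forming three loose groups.
docs = [
    [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0, 0.0], [0.0, 0.9, 0.1, 0.0], [0.0, 0.0, 0.9, 0.1],
]
centroids = kmeans(docs, k=3)
center = mean(centroids)
axes = principal_axes([[x - c for x, c in zip(v, center)] for v in centroids])
centroid_xy = project(centroids, center, axes)
doc_xy = project(docs, center, axes)
```

Fitting the projection on a handful of centroids rather than on every document keeps the expensive step small, which is one reading of why the slide pairs clustering with projection. Documents are then placed with the same axes.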

Managing the representation
Two basic classes of data
  – Raw text files
      Static in nature, simple in structure
      Easy to manage
  – Visual forms of the text
      Extensive and dynamic
Object-Oriented Database
  – Flexibility of data representation
  – Power of inheritance
  – Ease of data access

Interface design for text visualization
Backdrop
  – Central display resource
Workshop
  – Grid having resizable windows to hold multiple views
Chronicle
  – Area where views are placed and linked to form a visual story

Examples from the MVAB Project
MVAB
  – Multidimensional Visualization and Advanced Browsing Project
  – Visualization and analysis of textual information
  – Showcased in SPIRE
SPIRE
  – Spatial Paradigm for Information Retrieval and Exploration
Starfields and topographical maps metaphors
  – Galaxies and ThemeScapes

Galaxies
Displays cluster and document interrelatedness
2D scatterplot of 'docupoints'
Simple point and click exploration
Sophisticated tools
  – Facilitate more in-depth analysis
  – Example: temporal slicer
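
One way to picture the temporal slicer is as a date filter over the docupoints of the scatterplot. The sketch below is an assumption about the general idea, not the SPIRE tool: the `Docupoint` fields, the `temporal_slice` function, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Docupoint:
    """A document placed at (x, y) in the 2D Galaxies-style scatterplot."""
    doc_id: str
    x: float
    y: float
    published: date


def temporal_slice(points, start, end):
    """Keep the docupoints published within [start, end], inclusive."""
    return [p for p in points if start <= p.published <= end]


# Hypothetical docupoints.
points = [
    Docupoint("a", 0.1, 0.8, date(1995, 1, 10)),
    Docupoint("b", 0.4, 0.3, date(1995, 3, 2)),
    Docupoint("c", 0.7, 0.6, date(1995, 6, 21)),
]
visible = temporal_slice(points, date(1995, 1, 1), date(1995, 3, 31))
```

Sliding the date window and redrawing only the `visible` points would show how clusters grow or fade over time.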

Galaxies Screen Shot

ThemeScapes
Abstract, 3D landscapes of information
Convey relevant information about topics or themes without the cognitive load
Spatial relationships reveal the intricate interconnection of themes
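
A landscape of this kind can be approximated by turning docupoint positions into a height map, so that dense or strongly themed regions rise into peaks. The sketch below sums one Gaussian bump per docupoint. The Gaussian kernel, the `bandwidth` parameter, and the function name `theme_elevation` are assumptions for illustration, and the slides do not state how ThemeScapes computes elevation.

```python
import math


def theme_elevation(points, grid_size=5, extent=1.0, bandwidth=0.2):
    """Return a grid_size x grid_size height map over the square [0, extent]^2.

    points is a list of (x, y, weight) tuples; each adds a Gaussian bump
    centered on its position and scaled by its weight.
    """
    step = extent / (grid_size - 1)
    heights = []
    for row in range(grid_size):
        y = row * step
        line = []
        for col in range(grid_size):
            x = col * step
            height = sum(
                weight * math.exp(
                    -((x - px) ** 2 + (y - py) ** 2) / (2 * bandwidth ** 2)
                )
                for px, py, weight in points
            )
            line.append(height)
        heights.append(line)
    return heights


# Hypothetical docupoints: two documents share one spot, one sits elsewhere.
points = [(0.25, 0.25, 1.0), (0.25, 0.25, 1.0), (0.75, 0.75, 1.0)]
heights = theme_elevation(points)
```

In this example the grid cell over the two co-located documents is the highest peak, the single document forms a lower hill, and the empty corners stay near zero.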

ThemeScapes - Advantages
Displays much of the complex content of the document database
Utilizes innate human abilities for pattern recognition and spatial reasoning
Communicative invariance across levels of textual scale
Promotes analysis

ThemeScapes Screen Shot

Conclusions
Text visualizations can overcome many of the user's limitations
  – Enhanced insight and time savings (35 mins vs 2 weeks)
  – Creative with the tool
Querying and analytical manipulation come together in a single visualization
  – Permits a different kind of querying
Text visualizations will have to access and utilize the cognitive and visual processes

Directions for Future R & D
Visual Data Analysis
Elaborate the visual metaphors
Addition of sensory modalities
  – Virtual interaction

My Favorite Sentence
The bottleneck in the human processing and understanding of information in large amounts of text can be overcome if the text is spatialized in a manner that takes advantage of common powers of perception.

Contributions
Explorations of new visualizations
Discussion of the process for mapping Raw Data Document collections into visualizations

Notes on the Reference
Designing Interaction: Psychology at the Human Computer Interaction
Interfaces Issues and Interaction Strategies for Information Retrieval Systems
Clustering and Dimensionality Reduction in SPIRE

Critique – Strengths and Weaknesses
Strengths
  – Provide natural visual metaphors
  – Enable the users to see the relationships between documents with minimal required reading
Weaknesses
  – No validation of some conclusions

What has happened to this topic?
1996 R&D 100 Award
OCSB – On-line Citation Searching and Browsing in UMD
"ThemeScape" is now a trademarked term of Cartia, Inc.
WebTheme™ – an interactive tool that provides a visual display of the common themes in collections of web-based documents

WebTheme Screen Shot

Document Lens
Why:
  – Text too small to read but yet needed to perceive patterns
  – Perspective wall wastes corner areas of screen
What: General visualization technique based on a common strategy for understanding paper documents when their structure is not known
How: 3D Visualization Tool For Large Rectangular Presentations

Document Lens Features
Lens – rectangular – interested in text that is mostly rectangular
Sides are elastic and pull the surrounding parts towards the lens, creating a pyramid

Document Lens