Presentation at Society of The Query conference, Amsterdam, November 13-14, 2009 (original title: Learning from Google: software design as a methodology for cultural analysis).

Presentation transcript:

presentation at Society of The Query conference, Amsterdam, November 13-14, 2009 (original title: Learning from Google: software design as a methodology for cultural analysis)
Dr. Lev Manovich
Director, Software Studies Initiative, Calit2 + UCSD
Professor, Visual Arts Department
Follow our research: softwarestudies.com
Learning from software

we will compare common methods of cultural analysis in the humanities with the principles behind Google search, Google Earth, Google Analytics, and Google Trends

1| data size
cultural analysis: very small samples of cultural production
search engine: every accessible web document (and now also Twitter and Facebook)

2| coverage
cultural analysis: highly uneven coverage (some areas are covered in much higher resolution than others)
Google search technology / Google Earth: aiming for even coverage of all territory at the same level of detail (space in Google Earth, the web in Google search)

3| zoomability
cultural analysis: document - creator - group - period - paradigm
Google search technology / Google Trends: a single interaction/page - search patterns of billions of people over a number of years
Google Earth: street view - Earth view

4| categorization
cultural analysis: cultural objects are placed into a small number of genres/categories
search engine: analysis of each web document to generate its unique description (using 200+ signals). While significant research on automatic classification of web pages into genres exists, Google does not use it.

5| links
cultural criticism: analysis of a small number of selected links (influences) between a given object/person and others
search engine: systematic consideration of all (explicitly defined) links between a given web page and other pages
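
To make "systematic consideration of all links" concrete, here is a minimal sketch that counts inbound links for every page in a small link graph. It is an illustration only and not Google's ranking method; the function name `inbound_link_counts` and the input shape (a dict mapping each page to the pages it links to) are assumptions made for this example.

```python
from collections import Counter


def inbound_link_counts(link_graph):
    """Count, for every page, how many distinct other pages link to it.

    link_graph is a hypothetical structure: {page: [pages it links to]}.
    """
    counts = Counter()
    for source, targets in link_graph.items():
        counts[source] += 0  # pages with no inbound links still appear
        for target in set(targets):  # ignore repeated links from one page
            if target != source:  # ignore self-links
                counts[target] += 1
    return dict(counts)
```

Every explicitly defined link is visited once, in contrast to the handful of hand-picked influences a critic would trace.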

6| features (characteristics, attributes, dimensions)
cultural analysis: small number of subjectively selected features, different from text to text
search engine: lots of features (always the same)
Examples:
Google: PageRank [considers] more than 500 million variables and 2 billion terms. Our technology analyzes the full content of a page and factors in fonts, subdivisions and the precise location of each word.
Sense Networks attributes 487,500 dimensions to every place in a city.

7| interaction
cultural analysis: theoretical work on reception - but in practice, analysis of documents as experienced by a critic
search engine: analysis of documents, links, search engine use
web analytics: analysis of user interactions with a web site

summary: software developed in the digital culture industry and the academy (as exemplified by the applications/services discussed above) often contains innovative theoretical ideas about culture embedded in its design (i.e., the steps taken by the software to calculate the results). However, the applications of such software are often less innovative than the steps themselves.

examples of standard applications:
- search: looking for particular members of a set
- classification of cultural content into a small number of genres

example of a new application which uses some of the same steps:
1) extract features from each document in a set;
2) instead of using the features to classify documents into a few classes, visualize the patterns and variability across the set
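
The two steps can be sketched as follows. This is a minimal illustration under stated assumptions: documents are plain strings, and the two features (`length` and `vocab_ratio`) as well as the function names are hypothetical choices, not features named in the talk. Instead of assigning a class label, the sketch keeps every document's feature values and summarizes how they vary across the set.

```python
from statistics import mean, pstdev


def extract_features(text):
    """Step 1: describe one document with a few numeric features."""
    words = text.lower().split()
    return {
        "length": len(words),
        # share of distinct words; 0.0 for an empty document
        "vocab_ratio": len(set(words)) / len(words) if words else 0.0,
    }


def describe_set(documents):
    """Step 2: keep per-document features and summarize their variability."""
    rows = [extract_features(doc) for doc in documents]
    summary = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        summary[name] = {
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "spread": pstdev(values),  # population standard deviation
        }
    return rows, summary
```

The per-document rows are what a visualization would plot; the summary only indicates how much variability there is to show.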

Map of Science visualization of scientific paradigms - can we create similar maps of cultural fields which show hundreds or thousands of clusters - instead of dividing everything into a few genres?

let's take selected principles from search engines (and data analysis in general) + web analytics and Google Trends (interactive visualization of patterns) + Google Earth (continuous zoom and navigation) + manyeyes (visualization, sharing of data and analysis) and embed them in new software tools for researching, teaching and exhibition of culture.
we can call research which uses these principles and software tools cultural analytics

goals of cultural analytics:
- better represent the complexity, diversity, variability, and uniqueness of cultural processes and artifacts
- develop techniques to describe the dimensions of cultural artifacts and cultural processes which until now received little or no attention (such as gradual temporal change)
- create much more inclusive cultural histories and analysis, ideally taking into account all available cultural objects created in a particular cultural area and time period (art history without names)
- democratize cultural research by creating open-source tools for cultural analysis and visualization
- create interfaces for exploration of cultural data which operate across multiple scales, from details of the structure of a particular individual cultural artifact/process to massive cultural data sets/flows

cultural analytics - typical steps:
1) description (i.e., culture into data):
a) manual: annotation, tagging
b) automatic: software analysis of media; capturing user activity
our focus: easy-to-use techniques for automatic description of visual and interactive media
2) optional: statistical data analysis
3) data visualization (reduction, summarization) and data mapping (expansion, outlining, layering)
our focus: new visualization + mapping techniques appropriate for interactive exploration of large sets of visual objects
4) interpretation (humanities), or explanation (science), or correlation (social science)
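
Steps 1 to 3 can be sketched end to end on toy data. This is a minimal sketch, assuming each image is a grid of grayscale values (0-255) tagged with a year; the single feature (mean brightness), the per-year average, and the text bar chart are hypothetical stand-ins for the richer description, analysis and visualization the steps refer to. Step 4, interpretation, is left to the researcher.

```python
from collections import defaultdict
from statistics import mean


def describe(image):
    """Step 1b (automatic description): reduce an image to its mean brightness."""
    return mean(pixel for row in image for pixel in row)


def analyze(records):
    """Step 2 (statistical analysis): average brightness per year."""
    by_year = defaultdict(list)
    for year, image in records:
        by_year[year].append(describe(image))
    return {year: mean(values) for year, values in sorted(by_year.items())}


def visualize(trend, width=20, max_value=255):
    """Step 3 (visualization): one text bar per year, scaled to `width`."""
    lines = []
    for year, value in trend.items():
        bar = "#" * round(width * value / max_value)
        lines.append(f"{year} {bar} {value:.1f}")
    return "\n".join(lines)
```

Tracking one feature over time is one simple way to make the "gradual temporal change" mentioned in the goals visible.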

visualization of cultural data - visualization types:
1) visualization without doing additional annotation / automatic analysis
a) display all objects in a set together, organized by existing metadata (for instance, dates, artist names, etc.)
b) sample and re-order (for instance: montage, slice)
2) visualization after doing additional data analysis/annotation
c) visualization of newly generated metadata (graph)
d) display objects organized by metadata (image graph)
d1) 2D sorted view
d2) 2D image graph - using a single feature for each dimension
d3) 2D image graph - using a combination of features (PCA, etc.)
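
Types d1 and d2 can be sketched as a layout computation. This is a minimal sketch, assuming each object is a dict with an `id` and numeric feature values; the feature names, the canvas size and both function names are hypothetical. It only computes where each image would be placed and leaves the drawing to whatever graphics tool is used. Type d3 would first combine many features into two axes (for example with PCA) and then reuse the same placement.

```python
def sorted_view(objects, feature):
    """d1: order objects by one feature (ids returned in display order)."""
    return [obj["id"] for obj in sorted(objects, key=lambda obj: obj[feature])]


def image_graph(objects, x_feature, y_feature, width=1000, height=1000):
    """d2: place each object on a canvas, one feature per dimension."""

    def scale(value, low, high, size):
        if high == low:  # all values equal: center on this axis
            return size / 2
        return (value - low) / (high - low) * size

    xs = [obj[x_feature] for obj in objects]
    ys = [obj[y_feature] for obj in objects]
    return {
        obj["id"]: (
            scale(obj[x_feature], min(xs), max(xs), width),
            scale(obj[y_feature], min(ys), max(ys), height),
        )
        for obj in objects
    }
```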

theoretical issues:
- Can analysis of a cultural artifact/user experience in terms of separate features still capture the overall gestalt?
- Culture does not equate to cultural artifacts. How can we automatically analyze context in a meaningful way?
- If cultural process/activity is more important than the outputs being produced, how do we conceptualize and visualize this?
- Statistical paradigm (using a sample) vs. data mining paradigm (analyzing the complete population).
- Modernity/normal distribution vs. Software Society/power law.
- Pattern as a new epistemological object.
- From meaning to pattern: the humanities have been focused on interpreting the meanings of a cultural artifact/process. Today we can easily uncover the meanings of each cultural artifact - but we don't know the larger patterns they form. The new scale of culture points toward pattern as a new unit of analysis (because we cannot afford to consider the meanings of every single artifact).
- From a small number of genres to a multi-dimensional space of many features where we can look for clusters and patterns. This may be the only way to analyze contemporary design and media created with software. (See next slide.)