From Words to Meaning to Insight Copyright Leximancer 2011.

From Words to Meaning to Insight Julia Cretchley & Mike Neal.

Consider Some Text
"We use the laser 500 printer here at the office. We are pretty happy with it. Once there was a leak and all the toner spilled out of the machine, but a technician came out and fixed the problem for us. We still have to top the toner up often. The printer goes through ink quickly and the cartridges are expensive, but we put up with this because it delivers good results reliably. We are pleased with the quality of printing we get. The laser 500 can batch process, and collate the pages to save us time. Sometimes paper gets jammed in the laser 500. Then we have to open it up to remove the crumpled pages. We have tried other machines in the past, but have not found an alternative that works better for us."

A Definition
Content analysis is
  A formal methodology
  To study a collection of media
  To discover, uncover, or answer

A Little History
Systematic analysis of texts was performed several times by religious entities prior to 1900 (Krippendorff, 2004)
Major growth periods in the 20th century (Krippendorff, 2004)
  Early 20th-century studies of newspaper content
  Behavioral sciences emerged in the 1930s and 1940s and began to study media effects
  World War II brought about propaganda studies
  The postwar period saw expansion into conversation analysis, personal document analysis, processes of communication, and a generalized measure of meaning

A Little (More) History
Computer text analysis began in the 1960s but remained challenging beyond quantitative analysis of text (Krippendorff, 2004)
Today, the extensive proliferation of traditional, electronic, and social media is leading to strong interest in content analysis and more powerful software

Application Areas Today Conversations Sentiment Social Media Forensics Historical Reviews Political Documents News Analysis Propaganda Television Content Bias Determination Song Lyrics National Security Video Game Content More...

A Definition: Again
Content analysis is
  A formal methodology
  To study a collection of media
  To discover, uncover, or answer

A Formal Methodology
A formal, objective method with rigor and repeatability
Many methods and processes are valid
Methodology example
  1. Determine research question
  2. Identify and collect samples
  3. Perform quantitative analysis
  4. Perform qualitative analysis
  5. Draw conclusions
  6. Summarize, publish, and share results

To Study a Collection of Media
Media are methods of communicating information
Collections include the following (normalize formats to text)
  Written media, such as newspapers, magazines, websites, blogs, tweets, Facebook pages, emails
  Audio, such as radio programs, interview transcripts, conversations (can be transcribed into text)
  Video, such as television, movies, news footage, YouTube videos (can be transcribed into text)
  Images described in text

To Discover, Uncover, Answer...
Discover concepts, themes, and relationships in the collection
Uncover unknown qualities about the data
Answer a specific research question

Key Points Outline
All methods of content analysis share common components, which will now be presented
Quantitative (counting) and qualitative (meaning) analyses
  Analysts can use one or both methods
  A content analysis is best when both quantitative and qualitative approaches are combined (Weber, 1990)
  More later...
Important study aspects include sampling, units of measure, coding, validity, and reliability

Sampling
Sampling is a method of taking subsets of documents to study
Krippendorff (2004) provided this guidance
  Sampling plans are needed to reduce researcher bias
  Select a type of sampling (e.g., random)
  Sample size is important for the sample to be representative
  Split-half technique: two equal-sized halves of the sample should lead to the same result
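The random sampling and split-half ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the corpus `documents`, the sample size of 40, the helper names, and the choice of "share of documents mentioning a term" as the statistic to compare are all hypothetical and do not come from the slides.

```python
import random


def simple_random_sample(documents, k, seed=0):
    """Draw k documents without replacement (seeded so the draw is repeatable)."""
    rng = random.Random(seed)
    return rng.sample(documents, k)


def split_half(sample, seed=0):
    """Shuffle a copy of the sample and split it into two equal-sized halves."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def share_mentioning(docs, term):
    """Fraction of documents that contain the term (case-insensitive)."""
    if not docs:
        return 0.0
    return sum(term in d.lower() for d in docs) / len(docs)


# Hypothetical corpus: 100 short reviews, half about toner, half about paper jams.
documents = [
    f"review {i}: the toner ran out" if i % 2 == 0 else f"review {i}: paper jammed"
    for i in range(100)
]

sample = simple_random_sample(documents, 40)
half_a, half_b = split_half(sample)

# If the sample is large enough, the two halves should give similar figures.
print(share_mentioning(half_a, "toner"), share_mentioning(half_b, "toner"))
```

If the two halves disagree markedly on the statistic of interest, that is a hint the sample may be too small to be representative.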

Units of Measure
Sampling part 2: samples require a definition of data resolution
  Television comedies, 1/2 hour, Wednesday nights
  Entire tweet, tweets from a user, collection of topical tweets
  One blog entry, an entire blog, or consolidation of many blogs
  Newspaper article, articles of a set timeline
Content analysts must determine these units to measure
  Impacts relationships of words and coding
  Concept discovery restricted to within units

Coding
Process of examining text in a specific unit and extracting relevant data
Look for words, phrases, word sense, and categorize units of text (e.g., words, sentences, paragraphs, tweets)
Three methods of coding
  1. Manual, by person(s) coding from codebooks, instructional guides, intuition
  2. Computer-assisted (NVivo), beginning with manual coding, then often some automation for remaining documents
  3. Computer-generated (Leximancer, CATPAC)
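As a rough illustration of codebook-driven coding, the sketch below assigns categories to units of text by looking up words in a small dictionary. The codebook contents and the function name `code_unit` are hypothetical, and this simple lookup is not how NVivo, Leximancer, or CATPAC work internally; it only mimics what a human coder following a codebook might do. The units are sentences from the printer text earlier in the deck.

```python
import re

# Hypothetical codebook: category name -> words that signal the category.
CODEBOOK = {
    "consumables": {"toner", "ink", "cartridges"},
    "reliability": {"leak", "jammed", "fixed", "reliably"},
    "satisfaction": {"happy", "pleased", "good"},
}


def code_unit(unit, codebook=CODEBOOK):
    """Return the sorted list of categories whose signal words occur in the unit."""
    words = set(re.findall(r"[a-z0-9]+", unit.lower()))
    return sorted(cat for cat, terms in codebook.items() if words & terms)


# Units of text: here one sentence per unit.
units = [
    "We are pretty happy with it.",
    "Sometimes paper gets jammed in the laser 500.",
    "The printer goes through ink quickly and the cartridges are expensive, "
    "but we put up with this because it delivers good results reliably.",
]

for unit in units:
    print(code_unit(unit), "<-", unit)
```

A unit can receive several codes at once, as the third sentence does; whether that is allowed is a design decision the codebook should state.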

Reliability and Validity
For a formal analysis method to be sound, reliability and validity must be addressed
Reliability refers to stability and reproducibility
  Coding must be repeatable if manual or computer-assisted
  Inter-rater reliability for manual coding with multiple coders affects reproducibility and must be ensured
  Measure of accuracy is tied to statistical norms
  Accuracy is the strongest form of reliability (Weber, 1990)
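One common way to quantify inter-rater reliability between two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not name a specific statistic, so the choice of kappa here is an assumption, as are the function name and the toy label lists.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labelled the same units in the same order."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty set of units")
    n = len(codes_a)
    # Observed agreement: share of units given the same code by both coders.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: from each coder's own label frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same four units.
coder_a = ["pos", "pos", "neg", "neg"]
coder_b = ["pos", "neg", "neg", "neg"]

print(cohens_kappa(coder_a, coder_b))  # prints 0.5
```

Here the coders agree on 3 of 4 units (0.75), chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.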

Reliability and Validity (cont.)
Validity refers to general applicability of results and conclusions obtained from inferences in the study
Major concern for qualitative analysis in general
  Researcher chooses coding concepts and makes inferences
  Researcher bias, errors, conclusions
Neuendorf listed external validity, face validity, criterion validity, content validity, and construct validity
Are we measuring what we want to measure? (Neuendorf, 2002, p. 112)

Quantitative Analysis
Counting and statistics: numeric measurements
Word frequencies: how many times does a word appear?
  Specify stop-words to ignore (e.g., the, and, others)
  Need to consolidate synonyms, stems (e.g., dog = dogs)
  Compound words (i.e., word pairs) are important (e.g., "United States", "not good")
Categories (simply present or frequencies)
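The counting steps above (stop-words, stem consolidation, compound words) can be sketched as follows. The stop-word list, the compound table, the crude plural-stripping rule, and the sample sentence are all hypothetical simplifications chosen for illustration; a real study would use a proper stemmer and a fuller stop-word list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "and", "a", "of", "is", "at", "this"}

# Hypothetical compound words: word pair -> single merged token.
COMPOUNDS = {("united", "states"): "united_states", ("not", "good"): "not_good"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def merge_compounds(tokens, compounds=COMPOUNDS):
    """Replace known word pairs with one token so they are counted as a unit."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in compounds:
            merged.append(compounds[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def naive_stem(token):
    """Very crude stemming: strip one trailing 's' (so dogs -> dog)."""
    if "_" in token:  # leave merged compounds untouched
        return token
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def word_frequencies(text):
    """Count words after compound merging, stop-word removal, and stemming."""
    tokens = merge_compounds(tokenize(text))
    return Counter(naive_stem(t) for t in tokens if t not in STOP_WORDS)


text = "The dog chased the dogs. The United States is not good at this."
print(word_frequencies(text).most_common())
```

Compounds are merged before stop-words are removed so that a pair such as "not good" survives as one token instead of being reduced to "good".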

Quantitative Analysis (cont.)
Concept frequencies
  How often do concepts occur?
  Existence (occurs) or actual counts
Other statistics
  Proximity and co-occurrence frequencies can all be used to determine concept relationships
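A minimal sketch of concept existence counts and co-occurrence frequencies is shown below, with one sentence from the printer text treated as one unit. The concept list and function names are hypothetical, and a concept is simplified here to a single keyword, which is much cruder than the concepts a tool such as Leximancer derives.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical concepts, each represented by a single keyword.
CONCEPTS = {"printer", "toner", "ink", "paper", "machine", "cartridges"}


def concepts_in(unit):
    """Set of concepts whose keyword appears in the unit."""
    return set(re.findall(r"[a-z0-9]+", unit.lower())) & CONCEPTS


def concept_statistics(units):
    """Return (existence counts per concept, co-occurrence counts per concept pair)."""
    existence = Counter()
    cooccurrence = Counter()
    for unit in units:
        present = sorted(concepts_in(unit))
        existence.update(present)  # one count per unit in which the concept occurs
        cooccurrence.update(combinations(present, 2))  # each pair within the unit
    return existence, cooccurrence


units = [
    "Once there was a leak and all the toner spilled out of the machine.",
    "We still have to top the toner up often.",
    "The printer goes through ink quickly and the cartridges are expensive.",
    "Sometimes paper gets jammed in the laser 500.",
]

existence, cooccurrence = concept_statistics(units)
print(existence)
print(cooccurrence)
```

Because pairs are only counted inside a unit, the choice of unit (sentence, paragraph, whole document) directly changes which relationships can be found.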

Qualitative Analysis
Coding is performed to reduce the text collection to categories (i.e., concepts)
Analyst can seed concepts or discover concepts during analysis
Often, the more discovery allowed, the more objective the analysis (grounded theory reduces researcher bias)
Concepts and their relationships form the foundations for extracting meaning
Keyword in context (KWIC)
  Which words are used and how (Weber, 1990)
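Keyword in context (KWIC) can be illustrated with a short function that lists each occurrence of a keyword together with a few words on either side. The function name, the window size, and the tuple layout are assumptions made for this sketch; the text comes from the printer example earlier in the deck.

```python
import re


def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for each keyword occurrence."""
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


text = (
    "We still have to top the toner up often. "
    "Once there was a leak and all the toner spilled out of the machine."
)

for left, word, right in kwic(text, "toner", window=2):
    print(f"{left:>12} [{word}] {right}")
```

Reading the contexts side by side shows how a word is used, which is the information a pure frequency count discards.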

What is a Concept?
Synthesis of a text representation
Key words, including consolidating synonyms, stems
Represents something meaningful
Found by examining word, compound word, and surrounding words in a measurable unit
Useful to display on a graphical map

A Concept Map

Role of the Computer Solutions
A content analysis can be done without a computer. Although...
  At a minimum, a computer serves as a document file folder and backup device
  And a search tool for and within documents
  Software can also assist with manual coding, then continue coding automatically (NVivo)
  Or software can automate coding through statistical processing (Leximancer) or networks (CATPAC)

Key Points Summary
A content analysis is best when both quantitative and qualitative approaches are combined (Weber, 1990)
  Quantitative analysis counts and finds statistics
  Qualitative analysis determines meaning
Important operational aspects include sampling, units of measure, coding, validity, and reliability

References
Krippendorff, K. (2004). Content analysis: An introduction to its methodology (2nd ed.). Thousand Oaks, CA: Sage.
Neuendorf, K. A. (2002). The content analysis guidebook. Thousand Oaks, CA: Sage.
Weber, R. P. (1990). Basic content analysis. Newbury Park, CA: Sage.
Willig, C. (2008). Introducing qualitative research in psychology: Adventures in theory and method (2nd ed.). Philadelphia, PA: Open University Press.

Reading List
Evaluation of Unsupervised Semantic Mapping of Natural Language with Leximancer Concept Mapping, Andrew Smith
Conversations Between Carers and People With Schizophrenia: A Qualitative Analysis Using Leximancer, Julia Cretchley, Cindy Gallois, Helen Chenery, and Andrew Smith
Analysis of Asynchronous Discourse in Web-assisted and Web-based Courses, David Thomas and Cleborne Maddux

Reading List (cont.)
Computer Aided Phenomenography: The Role of Leximancer Computer Software in Phenomenographic Investigation, Sorrel Penn-Edwards
Content Analysis of a Random Day of Two News Sites: FoxNews.com and MSNBC.com, Michael R. Neal