Analytic Journalism and Decoding the Political Race(s)


Analytic Journalism and Decoding the Political Race(s)
2nd Session, Wed., 8 Oct. 2008: “Tracking the data's flow upstream. And then landing it.”
Tom Johnson, Institute for Analytic Journalism, Santa Fe, New Mexico
“Three Tuesdays @ SF Complex” (1st session, Tuesday, 30 Sept. 2008: "Newspapers are a 'morning line' tip sheet.")

Objectives
1st Tuesday, 30 Sept. 2008: "Newspapers are a 'morning line' tip sheet."
2nd Tuesday, 7 Oct. 2008: "How to track the data's flow upstream."
- Who’s behind those sites, anyway?
- What’s in online databases, and how did that data get there?
- How good, i.e. clean, is the data?
AJ and Decoding the Political Race(s) © J.T. Johnson 2008, Fall 2008

Did anyone ??? What learned?
- “Interrogate” the sub-pages, their links and the data in those links.
- List 5 story ideas, based on the data, that could be done.
- Find 5 “vested interest” sites that have data pertaining to political campaigns.

Candidates Use the Internet
- Web site
- Fund raising
- Email
- Online ads
- Webcasts of events
- Blogs and podcasts
- Take polls and surveys
- Communicate with press
- Television ads on the official site
- Campaign web video on other sites
- Participate in social networking sites
- Opposition research
- Manage field operations
- Text messaging
Source: “Politics and Web Strategy: Metrics of Success,” sponsored by Knight Digital Media Center, April 24, 2008. Karen A.B. Jagoda, President, E-Voter Institute (http://e-voterinstitute.com, karen@e-voterinstitute.com). Digital Politics, weekly Internet radio show: http://signonradio.com/programs/digital-politics

Voters Use the Internet
- Find out about all candidates and issues
- Contribute to candidates
- Organize for and against candidates and issues
- Tell their friends/family about political issues
- Post their own opinions in blogs
- Post video and audio related to candidates
- Rate posted videos
- Create their own sites
- Use social network sites
Source: “Politics and Web Strategy: Metrics of Success,” sponsored by Knight Digital Media Center, April 24, 2008. Karen A.B. Jagoda, President, E-Voter Institute (http://e-voterinstitute.com, karen@e-voterinstitute.com). Digital Politics, weekly Internet radio show: http://signonradio.com/programs/digital-politics

What we’re aiming for today
- Who/what is behind a web site?
- How do we get data off the web into analytic tools?
- Why do we care?
  - Efficiency
  - Dispersed national archive: The Wayback Machine (www.archive.org). One of the best ways to keep government at all levels honest is for The People to constantly capture The People’s data, not only in the U.S. but globally. Proving the purity of that data can be difficult, but it can be established if there are enough duplicate documents or data sets.
  - Accuracy: If we try to re-enter the data, each keystroke carries a probability of error. And the end result magnifies, just like a carpenter framing a wall: a quarter inch error at one end results in an inch of error every four feet.
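The accuracy point can be made concrete with a little arithmetic. The sketch below assumes each keystroke goes wrong independently with the same small probability, which is a simplification, and the error rate passed in is a hypothetical figure rather than a measured one. Under that assumption the chance of at least one error in n keystrokes is 1 - (1 - p)^n.

```python
def prob_at_least_one_error(per_key_error: float, keystrokes: int) -> float:
    """Chance that retyping `keystrokes` characters introduces one or more errors.

    Assumes every keystroke fails independently with probability `per_key_error`.
    """
    if not 0.0 <= per_key_error <= 1.0:
        raise ValueError("per_key_error must be between 0 and 1")
    if keystrokes < 0:
        raise ValueError("keystrokes must not be negative")
    # P(no errors) is (1 - p) multiplied by itself once per keystroke.
    return 1.0 - (1.0 - per_key_error) ** keystrokes
```

Calling `prob_at_least_one_error(0.01, 100)` gives roughly 0.63: even with a 1 percent slip rate, retyping a 100-character table row is more likely than not to contain a mistake, which is the case for copying data rather than re-keying it.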

Who/what is behind a web site?
- DNS (Domain Name System): a hierarchical naming system for computers, services, or any resource participating in the Internet. It associates various information with domain names assigned to such participants. Most importantly, it translates humanly meaningful domain names to the numerical (binary) identifiers associated with networking equipment for the purpose of locating and addressing these devices world-wide. An often-used analogy is that DNS serves as the "phone book" for the Internet by translating human-friendly computer hostnames into IP addresses. For example, www.example.com translates to the IP address 208.77.188.166.
- Whois: http://whois.domaintools.com/
- Betterwhois: www.betterwhois.com/
- Domain History (view historical whois records): http://domain-history.domaintools.com/#nsmessages
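The hostname-to-address translation described above can be tried directly. This is a minimal sketch using Python's standard `socket` module; the function name `resolve` is made up for illustration. It needs a network connection for real hostnames, and the address returned today will likely differ from the 2008 value quoted on the slide.

```python
import socket


def resolve(hostname: str) -> str:
    """Return one IPv4 address for `hostname`, as the system resolver reports it.

    The resolver normally asks DNS, but may also consult the local hosts file.
    A string that is already an IPv4 address is returned unchanged.
    """
    return socket.gethostbyname(hostname)

# Example (requires network access):
#   resolve("www.example.com")
```

The address that comes back is what you would then feed into a whois lookup to see who registered the block it belongs to.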

Domain walk-through
- Search Google for "republican national committee" site:.org (http://GOPdomains.notlong.com)
- No free sites to find all domains at an address or in a block of IPs, but see http://www.registrantsearch.com/
- But can triangulate by using phone numbers or addresses in Google

Data formats
- Ink-on-paper (use OCR)
- Online (copy and paste; insert URL)
- Documents (file type?)
- Dynamic databases
- PDF (export; OCR; c&p)
- Images (.jpg, .gif) [conversion tool]
- Audio and video files (text/content analysis)
- Less advanced tools

DB vs. Stats Apps vs. Excel
Statistical applications: SAS, R, Stata, Mathematica, SPSS
- Cases or records
- Fields or variables
- Alpha or numeric data
- Linkage = flatfile or relational
- Priority language scripts
Pros:
- Large data sets
- Fast sorting/filtering
- True, fine-tuned statistical methodology

DB vs. Stats Apps vs. Excel
Database:
- Cases or records
- Fields or variables
- Alpha or numeric data
- Linkage = flatfile or relational
- VBA scripts
Pros:
- Large data sets
- Fast sorting/filtering

DB vs. Stats Apps vs. Excel
Excel:
- Rows = records or cases
- Columns = variables or fields
- Alpha or numeric
- Linkage to workbooks/worksheets
- VBA scripts
Pros:
- Counting money/cases
- “Flow through” calculation
- Richer calculations/statistical
- Easy, if crude, graphics

Finding data
- Google’s Advanced Search: “file type”
- Google cheat sheet: http://www.google.com/help/cheatsheet.html
- Bookmarks
- But what is seen is not the digital reality
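The “file type” search can also be typed straight into the search box with the `filetype:` operator, and combined with the `site:` operator used in the domain walk-through. The sketch below only builds the search URL; the function name and the example phrase are illustrative assumptions, not part of the original session.

```python
from urllib.parse import urlencode


def google_search_url(phrase: str, filetype: str = "", site: str = "") -> str:
    """Build a Google search URL for an exact phrase, optionally limited
    to one file type (e.g. "xls") and one domain or suffix (e.g. ".gov")."""
    terms = [f'"{phrase}"']
    if filetype:
        terms.append(f"filetype:{filetype}")
    if site:
        terms.append(f"site:{site}")
    # urlencode escapes quotes, colons and spaces so the URL is safe to paste.
    return "https://www.google.com/search?" + urlencode({"q": " ".join(terms)})
```

For example, `google_search_url("campaign contributions", filetype="xls", site=".gov")` produces a query for spreadsheets on government sites, which is often a faster route to data than browsing pages.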

Getting data off web, into Excel
- No silver bullet!
- Try copying the URL and using it to open the page in Excel
- Look at the page “source.” If you can see the actual data, there are good odds you can copy and paste the table
- Use Firefox extensions and “right click” selections
- In Excel, “Paste” and “Paste Special”
- Consider “cell formats”
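When the data is visible in the page source as an ordinary HTML table, a short script can do the copy and paste. This is a rough sketch using only Python's standard library; the class and function names are invented here, the sample figures are made up, and it assumes a simple table without nested tables or merged cells. The CSV text it returns opens directly in Excel.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None when outside <tr>
        self._cell = None   # text fragments of the open cell, or None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            # Join fragments and collapse runs of whitespace to one space.
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing closing tag
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html: str) -> str:
    """Return the table cells found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample, for illustration only.
SAMPLE = """<table>
<tr><th>County</th><th>Raised</th></tr>
<tr><td>Santa Fe</td><td>$1,200</td></tr>
</table>"""
```

Dollar signs and thousands separators survive as text, so the “cell formats” caution still applies once the file is in Excel.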

A few Spreadsheet Commandments
- NEVER do analysis on the original file.
- After importing data, enter a row, cell or “Comment” with the source, its URL and the date.
- Save this file in a folder with a “Name-the-Virgin” file name.
- Re-save the file with a track-able file name, e.g. “OpenSource Casino v1 – Oct 8 08”.
- Periodically re-save with an adjusted file name. (This is why you want to keep a log book: so you will know how far back you need to go.)
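The first and fourth commandments can be automated. The following is a small sketch, not a prescribed workflow: the function name `working_copy` and the exact naming pattern are assumptions modeled on the “OpenSource Casino v1 – Oct 8 08” example, with a plain hyphen in place of the dash to keep file names simple.

```python
import shutil
from datetime import date
from pathlib import Path


def working_copy(original, label, version=1, when=None):
    """Copy `original` to a versioned, dated file next to it and return the
    new path. The original file is never modified."""
    original = Path(original)
    when = when or date.today()
    # e.g. "OpenSource Casino v1 - Oct 8 08.csv"
    name = f"{label} v{version} - {when:%b} {when.day} {when:%y}{original.suffix}"
    target = original.with_name(name)
    if target.exists():
        # Refuse to overwrite an earlier version; bump `version` instead.
        raise FileExistsError(target)
    shutil.copy2(original, target)
    return target
```

Do all analysis on the returned copy, and call the function again with a higher `version` at each checkpoint noted in the log book.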

Piggy-backing on Bill Dedman
- Obama leads the money chase in key counties
- Challenge in getting data out of sub-tables.
- Look at “Source” for tips.
- Use Firefox to highlight & copy (table?)
- Might have to try different approaches
- Scan the whole web page for happy surprises.

Different approaches???
- “Copy” data and paste into MS Word
- Turn on Word “show/hide ¶”
- Look for “delimiting” or “field separator” characters or symbols. Use those to parse with “Text to Columns” in Excel.
- Save Word file as “text” (.txt)
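The delimiter hunt above can be sketched in code. This example assumes the pasted text has been saved as plain text and that one of a few common separator characters appears the same number of times on every line; the candidate list and the function names are illustrative choices, not a standard.

```python
import csv
import io

# Checked in this order; commas go last because they often occur inside values.
CANDIDATES = ("\t", "|", ";", ",")


def guess_delimiter(text: str) -> str:
    """Return the first candidate that occurs the same nonzero number of
    times on every non-blank line."""
    lines = [line for line in text.splitlines() if line.strip()]
    for delim in CANDIDATES:
        counts = {line.count(delim) for line in lines}
        if len(counts) == 1 and counts != {0}:
            return delim
    raise ValueError("no consistent delimiter found")


def text_to_columns(text: str) -> list:
    """Split delimited text into rows of fields, like Excel's Text to Columns."""
    delim = guess_delimiter(text)
    reader = csv.reader(io.StringIO(text), delimiter=delim)
    return [row for row in reader if row]
```

Given tab-separated lines such as `Smith, John<TAB>250`, the tab is chosen over the comma because the comma count varies from line to line, so names with embedded commas stay in one column.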

Digital Shadow Puppets
- Basic, clean data and code
- Source: www.electionstudies.org, “Liberal-Conservative Self-Identification 1972-2004” (www.electionstudies.org/nesguide/toptable/tab3_1.htm)
- Drop down “Source.” Look at the code under the hood. Explain how the numbers got there.
- Scroll down to note Question Text
- Note graphs and sub-set of breakdowns. This is a VERY helpful database/site.
- Link to the ASCII text version of this table

Links with contribution data
- Opensecrets.org: www.opensecrets.org/index.php
- Copy and paste into Excel
- Note the URL and the “php” extension. Originally, PHP supposedly stood for Personal Home Page. It is an open source, server-side, HTML-embedded scripting language used to create dynamic Web pages. In an HTML document, PHP script (similar in syntax to Perl or C) is enclosed within special PHP tags. Because PHP is embedded within tags, the author can jump between HTML and PHP (similar to ASP and Cold Fusion) instead of having to rely on heavy amounts of code to output HTML. And, because PHP is executed on the server, the client cannot view the PHP code.
- Demo SOURCE code

Getting data out of PDF
- Get latest version of Acrobat Reader
- PDF app alternatives:
  - Acrobat Pro: pricey but evolving (buy one version back, e.g. 8.0 @ $147)
  - ABBYY PDF Transformer 2.0

Getting data out of PDF
- Aetna Report: http://www.aetna.com/about/aoti/aetna_pac/2007annualreport.pdf, page 4
  - Copy and paste into Excel
  - Check cell format types
- Common Cause, “NEW MEXICO: THE CAMPAIGN CONTRIBUTIONS AND LOBBYING EXPENDITURES OF THE TOBACCO INDUSTRY AND ITS ALLIES,” p. 16
  - C&P top table
  - What are you going to do first? Embed source info and save the “Name the Virgin” file. Then rename with a new version file name.
  - Demo Data-to-Text
  - Caution: make sure the table you see in Excel is the same as the one in the PDF file. It may be handy to print out the table first.

Browser’s PDF might not be PDF
- GovernmentAttic.org: Office of the Governor accessions into the Alaska State Archives and Records Dispositions, 01-January-2007 to September 2008
- Demo that the browser can “read” the PDF file, but copying from what’s on THIS screen doesn’t plug into EXCEL. Need to SAVE the file, then use ACROBAT to OPEN the file. Then copy and paste into EXCEL.

Review
- Data In → Analysis → Info Out
- Think about process & forms of data plus methods
- Need for new tools
  - Document transformation tools
  - Objective: get to ASCII or as close to it as possible
  - Search for “PDF to Text”, “html to text” and “pdf to excel converter”
  - See PDF Online and Easy Converter Desktop
  - Acrobat Professional can “export” PDF to different formats
  - OCR = Optical Character Recognition

For next week – 14 Oct.
- Pull down some data that interests you
- Get it into a spreadsheet
- Add a new column or row that gives you information about the topic you didn’t know before
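As one possible illustration of the last step, the sketch below adds a “share of total” column to a small table. The function name, the column names and the figures are hypothetical; the same calculation is a one-line formula in a spreadsheet.

```python
import csv
import io


def add_share_column(csv_text, amount_field, new_field="share_pct"):
    """Read CSV text and return its rows as dicts, each with an extra field
    giving that row's percentage of the column total (one decimal place)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = sum(float(row[amount_field]) for row in rows)
    for row in rows:
        # Guard against an all-zero column to avoid dividing by zero.
        share = 100 * float(row[amount_field]) / total if total else 0.0
        row[new_field] = round(share, 1)
    return rows
```

With the made-up input `"candidate,raised\nA,300\nB,100\n"` and `amount_field="raised"`, candidate A's new column reads 75.0 and B's reads 25.0, a comparison the raw dollar figures do not show at a glance.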

Objectives
1st Tuesday, 30 Sept. 2008: "Newspapers are a 'morning line' tip sheet."
2nd Tuesday, 7 Oct. 2008: "How to track the data's flow upstream."
3rd Tuesday, 14 Oct. 2008: “How to make sense of those numbers?”
- More on getting the data off the web
- PDFs, CSV & importing and exporting
- Simple arithmetic and simple spreadsheet tips