Build the NY Times Subject Headings and Topics in the Cloud Dr. Brand Niemann Director and Senior Data Scientist Semantic Community July 4, 2011 1.

Preface For the last 150 years, The New York Times has maintained one of the most authoritative news vocabularies ever developed. In 2009, they began to publish this vocabulary as linked open data. The New York Times also uses approximately 30,000 tags to power their Times Topics Pages. It is their intention to publish all of these tags as linked open data. Today AOL Government publishes both of those together as linked open data in Spotfire so our readers can more readily browse, search, and download these invaluable data sets! 2

data.nytimes.com See next slide People is a 14 MB RDF file! These can be screen scraped into Excel! 3
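As an alternative to screen scraping, an RDF/XML file can be read directly and flattened into rows that Excel opens as CSV. The following is a minimal sketch, assuming the file describes each entry as a `skos:Concept` with a `skos:prefLabel`; the inline sample, its `example.org` URIs, and the function names are hypothetical and are not taken from the actual NY Times file.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical stand-in for a downloaded RDF/XML file (assumed SKOS layout).
SAMPLE_RDF = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/id/1">
    <skos:prefLabel xml:lang="en">Doe, Jane</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/id/2">
    <skos:prefLabel xml:lang="en">Roe, Richard</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
"""


def concepts_to_rows(rdf_text):
    """Return (label, uri) tuples, one per skos:Concept element."""
    root = ET.fromstring(rdf_text)
    rows = []
    for concept in root.iter(f"{{{SKOS_NS}}}Concept"):
        uri = concept.get(f"{{{RDF_NS}}}about")
        label = concept.findtext(f"{{{SKOS_NS}}}prefLabel")
        rows.append((label, uri))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "uri"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = concepts_to_rows(SAMPLE_RDF)
csv_text = rows_to_csv(rows)
print(csv_text)
```

For a file as large as 14 MB, `ET.iterparse` could replace `ET.fromstring` to avoid holding the whole tree in memory; that variant is not shown here.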

Build Your Own NYT Linked Data Application March 30, 2010, 1:21 PM By EVAN SANDHAUS – That’s It?: So there you have it: all it takes to build a simple linked data application with New York Times Linked Open Data. But remember: this post just focuses on the highlights. We encourage you to take a closer look at the code and dig into some of the more advanced features we didn’t discuss. We hope that you share our excitement about the possibilities of linked data, and we look forward to seeing what you create! 4

Alumni in the News Opens and Closes Snippet 5

“Who Went Where” Code lines of code! 6

Subject Headings See next slide 7

Subject Headings See next slide 8

Using Our Linked Data 9

Times Topics The New York Times uses approximately 30,000 tags to power our Times Topics Pages. It is our intention to publish all of these tags as linked open data. See next page 10

Times Topics See next page 11

Times Topics 12

Spotfire
Describe the chart, how it’s made:
– The Spotfire chart was made by screen scraping the NY Times Subject Headings and Topics into an Excel spreadsheet and importing it into Spotfire. The author placed the two listings side by side, as Tufte suggests, to facilitate comparisons. The author also re-created the summary table of Subject Heading categories to see how much had changed between January 13, 2010, and July 4, 2011 (very little).
How it succeeds or falls short:
– This single Spotfire chart makes the two lists at the NY Times sortable (click on column headers), searchable (use Filters and facets), and downloadable (click on the down arrow in the table header in the Spotfire Web Player).
Add any tips for improving:
– The NY Times Topics need URLs (25,389), and the author will find a way to automate that task and will soon finish adding the URLs for NY Times reporters by hand. 13
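The two preparation steps described for this chart, placing the listings side by side and comparing category counts between two dates, can be sketched outside Spotfire as well. This is a minimal illustration only: the list entries, the category counts, and the function names are hypothetical placeholders, not the actual NY Times data or the author's workflow.

```python
from itertools import zip_longest

# Hypothetical placeholder entries, not real NY Times headings or topics.
subject_headings = ["Heading A", "Heading B", "Heading C"]
topics = ["Topic X", "Topic Y"]


def side_by_side(left, right):
    """Pair two lists row by row, padding the shorter one with blanks."""
    return list(zip_longest(left, right, fillvalue=""))


def count_changes(before, after):
    """Return after-minus-before counts for every category in either table."""
    categories = sorted(set(before) | set(after))
    return {name: after.get(name, 0) - before.get(name, 0) for name in categories}


# Hypothetical summary tables for two snapshot dates.
counts_first = {"People": 100, "Locations": 50}
counts_second = {"People": 102, "Locations": 50, "Organizations": 5}

table = side_by_side(subject_headings, topics)
changes = count_changes(counts_first, counts_second)

for left, right in table:
    print(f"{left:<12}{right}")
print(changes)
```

Padding with empty strings keeps the rows aligned when the two lists differ in length, which is the usual case for two independently maintained vocabularies.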

Spotfire 14 PC Desktop Spotfire