President's Chief Data Scientist: DJ Patil and HHS IDEA LAB Demand- Driven Open Data: David Portnoy Dr. Brand Niemann Director and Senior Data Scientist/Data.

Presentation transcript:

President's Chief Data Scientist: DJ Patil and HHS IDEA LAB Demand- Driven Open Data: David Portnoy Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community April 20,

Silicon Valley to Washington
Crafting Obama Administration Tech Policy:
– Megan Smith, from Google Inc., to be U.S. Chief Technology Officer (CTO)
– Alexander Macgillivray, from Twitter Inc., to be Deputy CTO
– Tony Scott, from VMware, to be U.S. Chief Information Officer (CIO)
– Mikey Dickerson, from QSSI, Google, and Obama for America, to be U.S. Digital Service Administrator
– DJ Patil, from VP of Product at RelateIQ and Data Scientist in Residence at Greylock Partners, to be Chief Data Scientist
– David Portnoy, from Aginity LLC, Datalytx, Inc. and Healthbox, to be HHS IDEA Lab Fellow
2

Dhanurjay Patil in the News
First White House Data Chief Discusses His Top Priorities (March 17, 2015)
– DJ Patil talks about how to get more out of public and private information while protecting that data from abuse.
On Demand Webinar Featuring New Federal Chief Data Scientist DJ Patil and Hilary Mason (February 24, 2015)
– Co-authors of a book you can download for free from O'Reilly called Data Driven.
The President Speaks At Hadoop World: Introduces DJ Patil as Nation’s First Chief Data Scientist (February 21, 2015)
– For the first time in the history of Hadoop World, the President of the United States gave a keynote.
DJ Patil Scores the Sexiest Job in D.C. (February 20, 2015)
– He co-wrote a paper that appeared in the October 2012 Harvard Business Review.
Unleashing the Power of Data to Serve the American People (February 20, 2015)
– “to responsibly source, process, and leverage data in a timely fashion to enable transparency, provide security, and foster innovation for the benefit of the American public, in order to maximize the nation’s return on its investment in data.”
The White House Names Dr. DJ Patil as the First U.S. Chief Data Scientist (February 18, 2015)
– Data science leadership on the Administration’s momentum on open data and data science.
Data Driven: Creating a Data Culture (with Hilary Mason) (January 5, 2015)
– Succeeding with data requires real cultural change and building a data culture is the key to success in the 21st century.
Tim O’Reilly: The World’s 7 Most Powerful Data Scientists (November 2, 2011)
– DJ Patil and Jeffrey Hammerbacher are numbers 2 and 3
3

U.S. Data Chief Aims to Empower Citizens with Information
Previous experience: Salesforce, EBay, Skype, PayPal and LinkedIn. He’s also done work for NOAA to improve weather forecasting and with the Defense Department. His data science team at LinkedIn Corp., the first of its kind, built the “People You May Know” button that helped launch the social platform by nudging users to link to their professional contacts.
MEETUP
In Silicon Valley, Mr. Patil is known as an evangelist for all things data.
The government’s role in making sure data isn’t misused and helping citizens take advantage of massive federal databases.
Mr. Patil said he was drawn to Washington by several White House initiatives. In the past two years, the administration established real-time dashboards that measure how multimillion-dollar information technology infrastructure projects are coming along; an online “blue button” that individuals can click to download their health records held in different silos across the government; and a “green button” that allows citizens to download statistics on their energy use.
National directory of farmers’ markets to a compendium of consumer complaints
However, it remains to be seen whether Mr. Patil will tackle the thorniest issues raised by big data. Privacy advocates have decried the Wild West atmosphere in which companies collect user data without permission, sell it to advertisers and others, and use behavioral science to discern intimate desires and habits.
His goal is to design simple, powerful features for government data, like a People You May Know button, he said. After all, not everyone will go to data.gov to download a spreadsheet. “The best data products often don’t show you any data,” he said. “They facilitate an end goal. They help you reach something more efficient.”
One of Mr. Patil’s first projects involves tracking the reasons people visit government websites and the steps they take to move through them. He aims to reduce those steps and anticipate people’s needs. Another is an initiative called “Precision Medicine,” announced in the president’s State of the Union address in January. Precision Medicine will involve about a million citizens who agree to participate in a longitudinal health study. Volunteers will wear digital health trackers and have their genes sequenced. Officials will use the resulting data to find patterns between lifestyle factors and genetic predispositions. Mr. Patil hopes the information will offer clues to who is likely to get sick or respond to certain treatments.
4

Back-of-the-Napkin Math, with DJ Patil: #PiDay Challenge
“Imagine you have a rope snug all the way around the equator. Now you need to add some rope so that the rope is 2 inches above the ground all the way around. How much rope do you need to add?”
The really cool thing about this problem is that it runs counter to our intuition. We think of the Earth’s equator as huge, and therefore the amount of rope to be added must be large. But let’s do the math. The equator of the Earth is a circle and its circumference is 2 x pi x r (or 2πr). The radius with the rope 2 inches off the ground would be r+2 inches, and the circumference would then be 2 x pi x (r+2) (or 2π(r+2)). Subtract the original circumference from the new one to get how much rope we would need: 2 x pi x (r+2) - 2 x pi x r = 2 x pi x r + 2 x 2 x pi - 2 x pi x r = 4 x pi. There you have it. All you would have to add is 4 x pi inches, or approximately 12.57 inches of rope! Crazy, but true. That’s why you need math!
5
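The rope arithmetic can be checked numerically. The short Python sketch below is an illustration added here, not part of the original slide; the function names and the equatorial radius value (6378.137 km, converted to inches) are assumptions chosen for the check. It computes the extra rope both from the simplified result (2 x pi x height) and from the difference of the two circumferences, to show that the radius cancels out.

```python
import math

# Assumed equatorial radius of the Earth (6378.137 km), converted to inches.
EARTH_RADIUS_INCHES = 6378.137 * 1000 * 100 / 2.54


def extra_rope(height: float) -> float:
    """Extra rope needed to lift a circular rope by `height` all the way around.

    This is the simplified result: 2*pi*(r + h) - 2*pi*r = 2*pi*h.
    """
    return 2 * math.pi * height


def extra_rope_from_radius(radius: float, height: float) -> float:
    """Same quantity, computed as the difference of the two circumferences."""
    return 2 * math.pi * (radius + height) - 2 * math.pi * radius


if __name__ == "__main__":
    # Rope lifted 2 inches off the ground, as in the challenge.
    print(round(extra_rope(2.0), 2))  # 12.57
    # The answer is the same for the Earth or for a circle of radius 1 inch
    # (up to floating-point rounding).
    print(extra_rope_from_radius(EARTH_RADIUS_INCHES, 2.0))
    print(extra_rope_from_radius(1.0, 2.0))
```

Because the radius drops out of the subtraction, the same 4 x pi inches would be needed for a rope around a basketball as for one around the equator.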

First White House Data Chief Discusses His Top Priorities
At the top of my list right now is the Precision Medicine Initiative. Science has enabled us to unlock the human genome. Now we want to combine that with the power of data science, which uses new techniques like machine learning as well as the explosion of data now available about individual patients, whether through their phones or other sensors in their environment. The challenge is putting this together to come up with new ways to think about health care and medical treatments.
– Semantic Medline and Natural Medicine for Disease and Wellness Meetup
My second priority is opening up more data and making it available for people [both the government and general public] to build an ecosystem of research, mobile apps and visualizations on top of that information.
– Semantic Community and Federal Big Data Working Group Meetup
The third main priority is inserting more data capacity into agencies throughout the government. We’re seeing a rise of data scientists and chief data officers at the National Institutes of Health as well as within [the Department of] Health and Human Services. The Commerce Department announced its first chief data officer [Ian Kalin] last week. We have to decide how to use the best of what we see in data science and statistics groups throughout the government to develop new services.
– Federal Big Data Working Group Meetup and Eastern Foundry
6

On the Case at Mount Sinai, It’s Dr. Data 1 Jeffrey Hammerbacher is a number cruncher — a Harvard math major who went from a job as a Wall Street quant to a key role at Facebook to a founder of a successful data start-up. But five years ago, he was given a diagnosis of bipolar disorder, a crisis that fueled in him a fierce curiosity in medicine — about how the body and brain work and why they sometimes fail. The more he read and talked to experts, the more he became convinced that medicine needed people like him: skilled practitioners of data science who could guide scientific discovery and decision-making. 7

On the Case at Mount Sinai, It’s Dr. Data 2 Now Mr. Hammerbacher, 32, is on the faculty of the Icahn School of Medicine at Mount Sinai, despite the fact that he has no academic training in medicine or biology. He is there because the school has begun an ambitious, well-funded initiative to apply data science to medicine. His group’s objective is to alter how doctors treat patients someday. For example, Mount Sinai medical researchers have done promising work on personalized cancer treatments. It involves the genetic sequencing of a patient’s healthy cells and cancer tumor. Once the misbehaving gene cluster is identified and analyzed, it is targeted with tailored therapies, drugs or vaccines that stimulate the body’s defenses. 8

Data Science for Natural Medicines and Epigenetics 9

Natural Medicine for Disease and Wellness Meetup 10

The Birth of Demand-Driven Open Data
Previous experience:
– Built an online marketplace for medical services called Symbiosis Health. I made use of three datasets across different HHS organizations.
– But I did so with great difficulty. Each had deficiencies which I thought should be easy to fix. It might be providing more frequent refreshes, adding a field that enables joins to another dataset, providing a data dictionary or consolidating data sources. If only I could have told someone at HHS what we needed!
Project champions:
– Keith Tucker and Cynthia Colton, Enterprise Data Inventory (EDI) Leads in the Office of the Chief Information Officer (OCIO).
– Damon Davis, Health Data Initiative and HealthData.gov Lead.
What is it:
– A framework of tools and methods to provide a systematic, ongoing and transparent mechanism for industry and academia to tell HHS what data they need.
– Lean Startup approach to open data to minimize up-front development, acquiring customers before you build the product.
Bigger picture:
– HHS’s existing Health Data Initiative (HDI) and HealthData.gov
– DDOD to serve as the community section of HealthData.gov.
Get involved in two ways:
– Get the word out to your network about the opportunities provided by DDOD
– Add use cases to

Demand-Driven Open Data
DDOD is a mechanism to tell data owners what's most valuable to you:
– Demand-Driven Open Data (DDOD) is a framework of tools and methodologies to provide a systematic, ongoing and transparent mechanism for you to tell public data owners what's most valuable.
– All work is entered, prioritized, implemented, and validated in the form of "use cases". This approach allows for all projects to have a known value even before work begins. It is the Lean Startup approach to open data initiatives.
Use Cases:
– Use cases initially get entered and discussed as GitHub issues (https://github.com/demand-driven-open-data/ddod-intake/issues) and linked to related wiki entries
Specifications:
– Detailed specifications for each use case are described in the intake wiki (https://github.com/demand-driven-open-data/ddod-intake/wiki) and linked to related issue entries
12
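Since use cases live as GitHub issues, they can in principle be listed programmatically. The Python sketch below is an illustration added here, not part of the DDOD tooling; it assumes the ddod-intake repository is still publicly available and uses the public GitHub REST endpoint for listing repository issues. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"
# Repository named on the slide; whether it is still online is an assumption.
REPO = "demand-driven-open-data/ddod-intake"


def issues_url(repo: str = REPO, state: str = "open", per_page: int = 30) -> str:
    """Build the REST URL that lists issues for a repository."""
    query = urlencode({"state": state, "per_page": per_page})
    return f"{API_ROOT}/repos/{repo}/issues?{query}"


def summarize_issues(issues: list) -> list:
    """Reduce raw issue records to (number, title, link) tuples.

    The issues endpoint also returns pull requests, which carry a
    "pull_request" key; those are skipped because they are not use cases.
    """
    rows = []
    for issue in issues:
        if "pull_request" in issue:
            continue
        rows.append((issue["number"], issue["title"], issue["html_url"]))
    return rows


def fetch_use_cases(repo: str = REPO, state: str = "open") -> list:
    """Fetch one page of issues over the network and summarize them."""
    request = Request(
        issues_url(repo, state),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urlopen(request, timeout=10) as response:
        return summarize_issues(json.load(response))


if __name__ == "__main__":
    for number, title, link in fetch_use_cases():
        print(f"#{number}: {title} ({link})")
```

Only the first page of results is fetched; a fuller client would follow the pagination links and authenticate to avoid rate limits.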

Key Questions
Will Precision Medicine include Natural Medicine for Disease and Wellness?
– Precision Medicine: Medical and genomic data provides an incredible opportunity to transition from a “one-size-fits-all” approach to health care towards a truly personalized system, one that takes into account individual differences in people’s genes, environments, and lifestyles in order to optimally prevent and treat disease. We will work through collaborative public and private efforts carried out under the President’s new Precision Medicine Initiative to catalyze a new era of responsible and secure data-based health care.
How does Demand-Driven Open Data fit with DJ Patil’s Top Three Priorities?
– Usable Data Products: The President’s Executive Order on machine-readable data gives us a tremendous opportunity to productively connect unique data sets. The challenge is that open data is necessary, but not always sufficient, to create value and drive innovation. For example, the binary 0s and 1s that allow a computer to generate an MRI are of little use to a patient; it is the computationally rendered MRI image that communicates the information locked inside of that binary data. We will work to deliver not just raw datasets, but also value-added “data products” that integrate and usefully present information from multiple sources.
Who will do the Responsible Data Science?
– Responsible Data Science: We will work carefully and thoughtfully to ensure data science policy protects privacy and considers societal, ethical, and moral consequences. Data will continue to transform the way we live and work.
13

EPA Big Data Analytics: Turning Data Into Value In support of CDS DJ Patil, I am developing a Data Science for EPA Big Data Analytics Data Product and Meetup in cooperation with EPA using EPA Ecosystem Data to answer not only EPA's Ethan McMahon's excellent questions (see next slide), but address the broader matter of: – Integration provides the right data to the right system or person in real time. – Analytics lets users develop insights using vast amounts of data to understand the past and anticipate the future. – Event processing combines the knowledge gained from analytics with real-time information to identify patterns of events and act to bring about the best outcomes. 14

EPA's Ethan McMahon's Excellent Questions
EPA is planning to stand up a big data analytics service within the agency. We’d appreciate ideas from the ESIP community in a few areas:
– 1. What problems have you tried to solve using data analytics and/or visualization?
– 2. Are there any strategies or best practices you used to manage data within or between enterprise data systems?
– 3. What techniques make sense for integrating large or varied data from multiple sources?
– 4. What technologies have you used and how did you select them?
– 5. Did you use any particular training resources for using big data analytics systems, and if so which ones?
– 6. What lessons would be helpful for us to learn as we set up this service?
We’re open to your ideas and we’re ready to share what we have learned. Please respond to me directly
15

Agenda
6:30 p.m. Welcome and Introduction (New Tutorial and Mentoring) Slides
6:45 p.m. Brief Member Introductions
7:00 p.m. David Portnoy, HHS IDEA Lab External Entrepreneur (see Washington Post Article: U.S. Turns to Private Sector) HHS IDEA LAB: Demand-Driven Open Data Slides and Discussion
7:45 p.m. Brand Niemann, Data Science for EPA Big Data Analytics
– Note: DJ Patil, President's Chief Data Scientist (Invited-Conflict-Later Date)
8:30 p.m. Open Discussion
8:45 p.m. Networking
9:00 p.m. Depart
16