Data Science for Big Data Application and Analytics MOOC Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community


1 Data Science for Big Data Application and Analytics MOOC
Dr. Brand Niemann, Director and Senior Data Scientist/Data Journalist, Semantic Community
http://semanticommunity.info/
http://www.meetup.com/Virginia-Big-Data-Meetup/
http://www.meetup.com/Federal-Big-Data-Working-Group/
http://www.meetup.com/Northern-Virginia-Semantic-Web-Meetup/
http://semanticommunity.info/Data_Science/Federal_Big_Data_Working_Group_Meetup
March 2, 2015

2 Introduction
Welcome:
– Federal Big Data Working Group Meetup
– Virginia Big Data Meetup
– Lotico Northern Virginia Semantic Web
– NEW: Natural Medicines for Health and Wellness

3 Federal Big Data Working Group Meetup
Federal: Supports the Federal Big Data Initiative, but is not endorsed by the Federal Government or its agencies.
Big Data: Supports the Federal Digital Government Strategy, which is "treating all content as data", so big data = all your content.
Working Group: Data Science Teams composed of Federal Government and non-Federal Government experts producing big data products.
Meetup: The world's largest network of local groups to revitalize local community and help people around the world self-organize, like the MOOCs (Massive Open Online Courses) being considered by the White House.

4 The Profit and Data Enterprises
Marcus Lemonis (born November 16, 1973) is a Lebanese-born American businessman, investor, television personality and philanthropist. He is currently the chairman and CEO of Camping World and Good Sam Enterprises, and the star of The Profit, a CNBC reality show about saving small businesses through People, Process, and Products.
– http://en.wikipedia.org/wiki/Marcus_Lemonis
The Federal Big Data Working Group Meetup is also about helping government agencies develop:
– People – Data Scientists
– Process – Data Infrastructure
– Products – Data Publications
Some examples:
– EPA
– FDA
– NOAA
– HHS
– Eastern Foundry
And provide MOOCs (Massive Open Online Courses) for training and networking.

5 Top 5 MOOCs for Data Science
COURSE | ORGANIZATION | NOTES
Machine Learning | Coursera (Stanford) | One of the first MOOCs
Intro to Data Science | Coursera (U of Washington) | Starts in April 2013
Intro to Statistics: Making Decisions Based on Data | Udacity | Enroll anytime
Introduction to Infographics and Data Visualization | Knight Center @ U of Texas | Starts January 12, 2013
Learning From Data | CalTech | Starts Jan 8, 2013
http://101.datascience.community/2012/12/26/top-5-moocs-for-data-science/

6 Five MOOCs for Big Data Applications and Analytics
Practical Data Science for Data Scientists by Niemann, Based on Schutt and O’Neil Book
Data Science for Data Mining by Niemann, Based on North Book and Borne Class
Federal Big Data Working Group Meetups by Niemann and Goodier
Tackling the Challenges of Big Data, MIT ProfessionalX Online Course by Niemann, Based on Rus and Madden MOOC
Data Science for Big Data Application and Analytics MOOC by Niemann, Based on Geoffrey Fox MOOC
Data Science for Mining of Massive Datasets by Niemann, Based on Stanford MOOC (IN PROCESS)
See: Top 5 MOOCs for Data Science

7 Calendar
NITRD FASTER Bigdata at NSF, February 17, 2015:
– To be rescheduled
Mission Source Consulting Launch Party, February 28:
– Pre-launch of Natural Medicines for Health and Wellness Meetup
5th Annual Government Big Data Forum, March 12, 2015
USDA CIO and ACDO on Open Data Plan and Roundtable, March 16, 2015
Government Technology & Innovation Incubator for Big Data Analytics II, TBA. Week of March 23, Need Sponsor
Data Science for HealthData.gov Developers & Family Caregivers, April 6, 2015
– David Portnoy out of the country – working on replacement
The Wharton DC Alumni Innovation Summit, April 28-29, 2015
President's Chief Data Scientist and EPA Big Data Analytics, April 20, 2015
– David Portnoy helping organize
Data Science for Natural Medicines and Epigenetics, May 4, 2015
– Dr. Joel D. Wallach: Epigenetics

8 Data Science for MyFamilySearch.org and FamilyTree DNA, February 16th
January 13, 2015: Family Search Launches New App Gallery (more than 50 apps)
February 12–14, 2015: RootsTech 2015 Developer Challenge in Salt Lake City, Utah
My Entry: Big Data from Everywhere for Families and Community Service
My Partner Work: Data Science for MyFamilySearch.org
Syed Ali’s App: National Geographic Genographic Project and Big Data
You could be a partner and develop apps (e.g. A Billion Person Family Tree with MongoDB by Randall Wilson, Family Tree of Data: Provenance and Neo4, etc.)

9 FamilySearch.org
“FamilySearch is a great resource, but FamilySearch alone can’t do everything. That is why we work with partners to provide complementary tools and resources and why the FamilySearch App Gallery is so important,” said Dennis Brimhall, FamilySearch CEO. “We’ve had partners for many years, and now we want to make it easier for our patrons to know about them and to find the apps they need.”

10 MyTableBox of MyFamily Tree
http://semanticommunity.info/MyFamilySearch.org#MyTableBox_of_MyFamily_Tree

11 Person Template for Brand Lee Niemann
http://semanticommunity.info/MyFamilySearch.org#Person_Template_for_Brand_Lee_Niemann

12 Mini-Tutorial: Sony Camcorder and Camtasia Video to YouTube Video
How is the data collected?
– Sony Camcorder and PowerPoint Slides.
Where is the data stored?
– Hard drive and DVD in MP4 format.
What are the results?
– MP4 files converted and uploaded to YouTube.
Why should we believe the results?
– Because I and others have done it successfully many times.

13 Data Science for Natural Medicines
YouTube

14 Big Data Symposium at the National Research Council, October 23, 2014
Symposium on the Interagency Strategic Plan for Big Data: Focus on R&D
8:45 Session chair: Clifford Lynch, CNI (Conflict)
Overview Presentation: Allen Dearry, NIH (North Carolina), and Five Strategic Areas:
– 1. Technologies, Howard Wactlar, NSF
– 2. Knowledge to action, Peter Lyster, NIH (Previous Meetup)
– 3. Sustainability, Sky Bristol, USGS (by videoconference)
– 4. Education, Michelle Dunn, NIH
– 5. Gateways, Kamie Roberts, NIST
10:15 Break
10:30 Session chair: Alexa McCray, Harvard Medical School (Conflict)
Panel discussion: Response to Five Strategic Areas and Comments
– Keith Clarke, UC Santa Barbara
– Kirk Borne, George Mason University (Previous Meetup)
– Jane Snowdon, IBM Federal
http://sites.nationalacademies.org/PGA/brdi/PGA_152373
Note: This is where I started to organize a Meetup, but now we have gone beyond this.

15 Big Data Application and Analytics MOOC: Email
The email said: Check Out Professor Geoffrey Fox's MOOC Starting December 1st.
Folks, here is the link to the "Big Data Applications and Analytics" MOOC:
– https://bigdatacourse.appspot.com/preview
For big data novices, this is a gentle introduction. For big data experts, this exposes Professor Fox's perspectives and insights and source references without all the details and mathematical models.

16 Data Science for Big Data Application and Analytics MOOC
I downloaded the ZIP file of Course Syllabus (PDF), Slides (PDF) and Python Files and explored them. Then I mined and structured the content into MindTouch to build a Knowledge Base of the essence of Professor Geoffrey Fox's MOOC so I could make it part of the Federal Big Data Working Group Meetup MOOC.
I asked Professor Fox: Are there data sets used in the course? His reply was: Only a few small sample datasets and simple Monte Carlo sets. So I set about to find them and reuse them in Spotfire Statistics and Visualizations.

17 Data Science for Big Data Application and Analytics MOOC: Knowledge Base

18 Big Data Application and Analytics MOOC
Section 1: Introduction
Section 2: Overview of Data Science: What is Big Data, Data Analytics and X-Informatics? (See Next Slide)
Section 3: Technology Training
Section 4: Physics Case Study
Section 5: Technology Training
Section 6: e-Commerce and LifeStyle Case Study
Section 7: Infrastructure and Technologies for Big Data X-Informatics
Section 8: Web Search Informatics
Section 9: Technology for X-Informatics
Section 10: Health Informatics
Section 11: Sensor Informatics
Section 12: Radar Informatics
Section 13: Spotfire: Spotfire Recommendations for Analytic Data Publications (Note: My Section)

19 Semantic Web and Big Data: Features of Data Deluge
Semantic Web/Grid Versus Big Data:
– The original vision of the Semantic Web was that one would annotate (curate) web pages with extra "meta-data" (data about data) to tell the web browser (machine, person) the "real meaning" of the page
– The success of Google Search is the "Big Data" approach; one mines the text on the page to find the "real meaning"
– Obviously the combination is powerful, but the pure "Big Data" method is more powerful than expected 15 years ago
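The contrast on this slide can be illustrated with a toy sketch. It is not taken from Professor Fox's course materials: the page content, the stopword list, and the function names below are all hypothetical, and simple word counting stands in for what a real search engine does at far larger scale.

```python
import re
from collections import Counter

# Hypothetical page: hand-curated metadata plus raw body text (both invented).
PAGE = {
    "meta_keywords": ["genealogy", "family tree"],
    "text": "Build a family tree. Search family records and share your family history.",
}

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "and", "the", "your", "of", "to"}


def topics_from_metadata(page):
    """Semantic Web style: trust the annotations a curator attached to the page."""
    return list(page["meta_keywords"])


def topics_from_text(page, k=2):
    """'Big Data' style: mine the page text itself for its most frequent terms."""
    words = re.findall(r"[a-z]+", page["text"].lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


if __name__ == "__main__":
    print("From metadata:", topics_from_metadata(PAGE))
    print("From text:    ", topics_from_text(PAGE))
```

The metadata route depends on someone having curated the page; the text-mining route needs no annotation but only approximates the "real meaning". Combining the two lists is the "combination" the slide calls powerful.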

20 Agenda
6:30 p.m. Welcome and Introduction (New Tutorial and Mentoring) Slides, Big Data Symposium at the National Research Council, October 23, 2014, Slides and Data Science for Big Data Application and Analytics MOOC. Also See: Top 5 MOOCs for Data Science and Spotfire Recommendations for Analytic Data Publications
7:10 p.m. Brief Member Introductions
7:15 p.m. Professor Geoffrey Fox (remote), Director of the Digital Science Center and Associate Dean for Research and Graduate Studies at the School of Informatics and Computing, Big Data Applications and Analytics MOOC. Also See: Community Grids Laboratory Web Sites and Future Systems
8:30 p.m. Open Discussion
8:45 p.m. Networking
9:00 p.m. Depart
http://www.meetup.com/Federal-Big-Data-Working-Group/events/218925651/

