1 Some Musings on Life & “Data Science” Statistical Learning and Data Science Friday Center, UNC, Chapel Hill J. S. Marron Dept. of Statistics and Operations Research University of North Carolina

2 Some Views of Statistics Statistics Most People

3 Reality Some Views of Statistics Statistics Bayes Functional Data HDLSS Machine Learning Sparsity MCMC Kernels Bootstrap Survival Analysis Mixed Models Time Series Etc. Etc. Etc. …

4 Some Views of Statistics Statistics in Science Statistics

5 Some Views of Statistics Statistics in Science Statistics Medicine Biology Agriculture Psychology Economics Geology Physics

6 Some Views of Statistics John Tukey Quote: Statistics in Science From: http://www.morris.umn.edu/~sungurea/introstat/history/w98

7 Some Views of Statistics John Tukey Quote: “The best thing about being a statistician is that you get to play in everyone's backyard” Statistics in Science From: http://www.york.ac.uk/depts/maths/histstat/tukey_nytimes.htm

8 Some Views of Statistics Words coined by John Tukey: • Bit (0 – 1 data unit) • Software (mention to Computer Science friends…)

9 Some Views of Statistics Another Prescient Statistician: Bill Cleveland Coined the Term “Data Science” Cleveland, W. S. (2001). Data science: an action plan for expanding the technical areas of the field of statistics. International Statistical Review.

10 Some Views of Statistics Statistics Most People

11 Some Views of Statistics Statistics “Data Science (Analytics)” • Computer Science • Math (Applied) • Bus. / Finance • Others (Info. Sci., Psych, …)

12 Some Views of Statistics Statistics What is (should be) the relationship? Data Science Machine Learning … (Cleveland View)

13 Some Views of Statistics Statistics What is (should be) the relationship? Machine Learning … Data Science

14 The Big Question What are the Boundaries of Statistics? NSF/DMS Program Director (late 2004): “That is not statistics”

15 The Big Question What are the Boundaries of Statistics? OK, then where are they? We should discuss this much more… Openly, not in the “Rejection Process (Publications, Grants, etc.)”

16 Variation Thoughts From Business Statistics Course

17 Variation A Fundamental Concept: • Sounds Obvious • Easy to Not Consider (Forget) {Surprisingly So}

18 Variation • Easy to Not Consider (Forget) E.g. An Explorer Drowned in a Lake That Averaged 6 Inches in Depth… ◦ Hard to visualize? Thanks to N. I. Fisher

19 Variation • Easy to Not Consider (Forget) E.g. An Explorer Drowned in a Lake That Averaged 6 Inches in Depth… ◦ Hard to visualize? Lake Eyre, Australia, from Wikipedia

20 Variation • Easy to Not Consider (Forget) E.g. An Explorer Drowned in a Lake That Averaged 6 Inches in Depth… ◦ Hard to visualize? Lake Eyre, Australia, from www.airadventure.com.au

21 Variation • Easy to Not Consider (Forget) E.g. An Explorer Drowned in a Lake That Averaged 6 Inches in Depth… ◦ Hard to visualize? ◦ Key is Variation About “Average” ◦ Simple Idea Takes a Minute to Recall (happens a lot)
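
A minimal sketch of the lake example, assuming a made-up set of depth soundings (the numbers are hypothetical, chosen only so that the average comes out at 6 inches; they are not measurements of Lake Eyre or any real lake). It shows how an average can look harmless while the spread around it does not:

```python
from statistics import mean, pstdev

# Hypothetical depth soundings in inches (invented for illustration only):
# 24 shallow spots of 2 inches and one hole that is 102 inches deep.
depths_in = [2] * 24 + [102]

avg = mean(depths_in)        # the "average depth" quoted in the story
spread = pstdev(depths_in)   # population standard deviation about that average
deepest = max(depths_in)     # the value that matters to the explorer

print(f"average depth:      {avg:.1f} in")
print(f"standard deviation: {spread:.1f} in")
print(f"deepest point:      {deepest} in")
```

Reporting the standard deviation or the maximum next to the mean is one simple way to keep the variation in view; other summaries (quantiles, a histogram) would serve equally well.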

22 Variation A Fundamental Concept: • Sounds Obvious Common Gross Oversimplification: They are going to … They all want to … Group of people: Political, Religious, Ethnic Origin, … U.S. Presidential Politics?!?

23 Variation Homework C0.1 Find an Example of Ignoring Variation. Send me an email with the text and its attribution. Plan to discuss in class.

24 Variation Homework C0.1 Results: Out of the First 10 Quotes, 9 Were From Donald Trump

25 Ideas on Human Relationships Common Question: “How Are Dep’t Politics Going?” Background: • Long Dubious History • Merger of Statistics & OR (More Diverse Interests) • Rapidly Changing University

26 Ideas on Human Relationships Response: “Best I’ve Seen in Chapel Hill” Reason: Respect Key to Current Interactions Moved Beyond “Politics of Disrespect”

27 Ideas on Human Relationships Fundamental Observation: Human Interactions Work Best In An Atmosphere of Respect • Day to Day Interactions w/ Colleagues • Reviews of Papers / Grant Proposals • US Congress • US Presidential Politics…

28 28 UNC, Stat & OR Special Thanks Department of Statistics and Applied Prob. National University of Singapore For Many Discussions  This Talk

29 BIG DATA Models & Concepts Challenge from the Recent Media: Mayer-Schönberger and Cukier (2014) “Big Data: A Revolution That Will Transform How We Live, Work, and Think”

30 BIG DATA Models & Concepts Challenge from the Recent Media: Mayer-Schönberger and Cukier (2014) Major Premise: Differing Data Analytic Goals “Correlational” vs. “Causal”

31 BIG DATA Models & Concepts “Causal” Data Analysis: • Goal: Underlying Causes of Phenomena • Approach: Classical “Scientific Method” ◦ Formulate Hypothesis ◦ Collect Data ◦ Test Hypothesis • Consequences: Solid Knowledge w/ Measurable Certainty

32 BIG DATA Models & Concepts “Correlational” Data Analysis: • Goal: Find (and Use) Mere Correlations • Motivation: Correlations are ◦ Useful (e.g. ___ Recognition Software) ◦ Valuable (Buying and Selling of Data…) ◦ Insightful???? • Consequences: Automatic Solutions to Some Hard Problems

33 Correlation vs. Causation How New Is This Discussion?

34 Correlation vs. Causation How New Is This Discussion? Naïve Readers [Of Mayer-Schönberger and Cukier (2014)]: This is Exciting!!! Great New Ideas!!! Change Statistics Curricula!!! Start Up “Data Analytics”!!!

35 Correlation vs. Causation How New Is This Discussion? Time Statistics

36 Correlation vs. Causation How New Is This Discussion? Pattern Recognition Artificial Intelligence Neural Networks Data Mining Machine Learning Time Statistics

37 Correlation vs. Causation How New Is This Discussion? Pattern Recognition Artificial Intelligence Neural Networks Data Mining Machine Learning ??? Time Statistics

38 Correlation vs. Causation How New Is This Discussion? Pattern Recognition Artificial Intelligence Neural Networks Data Mining Machine Learning Big Data – Data Science Time Statistics

39 A Small Aside A Personal Apology to Xiaotong Shen For My Skepticism About ASA Section on Data Mining My (Wrong) Idea: Name Would Change, So Not Appropriate as “Section” {Great to See Recent Name Change}

40 Correlation vs. Causation How New Is This Discussion? Pattern Recognition Artificial Intelligence Neural Networks Data Mining Machine Learning Big Data – Data Science Time Statistics

41 Correlation vs. Causation How New Is This Discussion? Pattern Recognition Artificial Intelligence Neural Networks Data Mining Machine Learning Big Data Some Came With Major New Ideas

42 Correlation vs. Causation How New Is This Discussion? Pattern Recognition Artificial Intelligence Neural Networks Data Mining Machine Learning Big Data Less So For Others, But More Focus On

43 Correlation vs. Causation How New Is This Discussion? Data Mining Great Correlational Discovery

44 Correlation vs. Causation How New Is This Discussion? Data Mining Great Correlational Discovery: Supermarket Scanner Data Baby Diapers (aka Nappies) & Beer

45 Correlation vs. Causation How New Is This Discussion? Data Mining Baby Diapers (aka Nappies) & Beer Some Perspective: • Correlational Discovery • Makes Causational Sense (Too Soon To Totally Dump Causation)
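
To make "correlational discovery" in scanner data concrete, here is a small sketch that computes the lift of diapers and beer over a handful of baskets. Everything in it is an assumption for illustration: the baskets are invented, the helper names `support` and `lift` are hypothetical, and lift is simply one common association measure, not necessarily how the original diapers-and-beer finding was made.

```python
# Hypothetical scanner baskets (invented for illustration; not real supermarket data).
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"bread"},
    {"diapers", "beer", "milk"},
    {"bread", "milk"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def lift(a, b, baskets):
    """Ratio of observed co-occurrence to what independence would predict.

    A value above 1 means the two items appear together more often than
    expected if purchases of `a` and `b` were unrelated.
    """
    return support({a, b}, baskets) / (support({a}, baskets) * support({b}, baskets))


diaper_beer_lift = lift("diapers", "beer", baskets)
print(f"lift(diapers, beer) = {diaper_beer_lift:.2f}")
```

A lift above 1 flags the pair as worth a look, but it says nothing about why the items travel together; that is where the causal reading on the slide comes in.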

46 Correlation vs. Causation Relative Emphasis???

47 Correlation vs. Causation Relative Emphasis??? Classical Statistics: Correlation vs. Causation

48 Correlation vs. Causation Relative Emphasis??? Mayer-Schönberger and Cukier: Correlation vs. Causation

49 Correlation vs. Causation Relative Emphasis??? Suggested Actual Future Course: Correlation & Causation

50 Correlation vs. Causation Relative Emphasis??? Suggested Actual Future Course: Correlation & Causation Note: Changes Are Needed in Curricula, Etc.

51 The Big Question What are the Boundaries of Statistics? NSF/DMS Program Director (late 2004): “That is not statistics”

52 The Big Question What are the Boundaries of Statistics? We Should Openly Discuss Much More… Statistics Data Science Machine Learning … OR Data Science Machine Learning … Statistics

53 The Big Question What are the Boundaries of Statistics? We Should Openly Discuss Much More… How Much Leadership Should We Take? Let’s Embrace Our Wide Diversity of Opinions on This Point

54 Challenges for You Lead Statistics (D. S.) into the Future Promote Increasing Breadth Embrace New Ideas Advocate Them While Reviewing Speak Up Serving On Panels Openly Discuss Boundaries

