Published by Alicia Wilkins. Modified over 8 years ago.
1
Data Science from 3,209 Feet
John Chandler
University of Montana and Ars Quanta
4
A Data Scientist Toolkit
- A scripting language (Python, C#, Java, Perl)
- A statistical computing language (R, SAS, SPSS)
- Database languages/environments (MSSQL, Oracle, Postgres, sqlite)
- Distributed computing environment (MapReduce, in many flavors)
Fundamentally we are flipping bits, but this isn’t software development.
6
CRISP-DM, Shearer, 2000
19
Tools for data preparation
- A scripting language (Python, C#, Java)
- A statistical computing language (R, SAS, SPSS)
- Database languages/environments (MSSQL, Oracle, Postgres, sqlite)
- Distributed computing environment (MapReduce)
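As a rough illustration of how two of these tools can combine during data preparation, the sketch below uses Python to clean a few records and sqlite to aggregate them. It is not taken from the talk: the sample rows, the `purchases` table, and the `clean` and `totals_by_customer` helpers are all hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical raw records: (customer, amount as text), with messy entries.
raw_rows = [
    ("alice ", "10.50"),
    ("Bob", "n/a"),
    ("ALICE", "4.50"),
    ("bob", "7"),
]


def clean(rows):
    """Normalize customer names and drop rows whose amount is not numeric."""
    cleaned = []
    for name, amount in rows:
        try:
            value = float(amount)
        except ValueError:
            continue  # skip amounts that cannot be parsed as numbers
        cleaned.append((name.strip().lower(), value))
    return cleaned


def totals_by_customer(rows):
    """Load cleaned rows into an in-memory sqlite database and sum per customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT customer, SUM(amount) FROM purchases "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


totals = totals_by_customer(clean(raw_rows))
print(totals)
```

The scripting language handles the irregular, row-by-row cleanup, while the database handles the set-based aggregation. Which tool does which job is a design choice for this sketch, not a rule from the slides.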
20
CRISP-DM, Shearer, 2000
27
Advice
- What is the simplest thing that could possibly work?
- Start small and expand scope.
- Use general tools.
- Bring uncertainty into the spotlight.
- Expect iteration.
- Clear-eyed evaluation of not competing on data.