Data Science from 3,209 Feet. John Chandler, University of Montana and Ars Quanta.


A Data Scientist Toolkit
- A scripting language (Python, C#, Java, Perl)
- A statistical computing language (R, SAS, SPSS)
- Database languages/environments (MSSQL, Oracle, Postgres, SQLite)
- A distributed computing environment (MapReduce, in many flavors)
Fundamentally we are flipping bits, but this isn’t software development.

CRISP-DM, Shearer, 2000

Tools for data preparation
- A scripting language (Python, C#, Java)
- A statistical computing language (R, SAS, SPSS)
- Database languages/environments (MSSQL, Oracle, Postgres, SQLite)
- A distributed computing environment (MapReduce)
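
As a rough illustration of how two of these tools can combine in data preparation, the sketch below uses a scripting language (Python) to clean a few records and a database (SQLite, through Python's standard `sqlite3` module) to aggregate them. It is not taken from the talk: the sample data, the `sales` table, and the function names `prepare_rows` and `summarize` are all hypothetical, chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: inconsistent case, stray whitespace, a missing value.
RAW_CSV = """region,sales
 west ,100
East,250
west,
EAST,50
"""


def prepare_rows(raw_text):
    """Normalize region names and drop rows that have no sales figure."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        region = record["region"].strip().lower()
        sales = record["sales"].strip()
        if not sales:
            continue  # skip rows missing the value we need
        rows.append((region, float(sales)))
    return rows


def summarize(rows):
    """Load cleaned rows into an in-memory SQLite table and total them by region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


totals = summarize(prepare_rows(RAW_CSV))
print(totals)
```

The cleaning step stays in the scripting language and the aggregation stays in SQL, so either half could be swapped for another tool from the list without touching the other.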

CRISP-DM, Shearer, 2000

Advice
- What is the simplest thing that could possibly work?
- Start small and expand scope.
- Use general tools.
- Bring uncertainty into the spotlight.
- Expect iteration.
- Clear-eyed evaluation of not competing on data.