UNLOCKING THE SECRETS HIDDEN IN YOUR DATA PART 1 Data and Data Analysis.

Presentation transcript:

UNLOCKING THE SECRETS HIDDEN IN YOUR DATA PART 1 Data and Data Analysis

Data
What is data? Data is information gathered from observation, experimentation, or modeling.
- Qualitative: not precise (usually descriptive)
- Quantitative: precise (usually numeric)
For example, the output of your model (number of healthy agents, number of infected agents, time…)

Data
How do we gather data? Data collection is the systematic recording of information while changing variables (a variable is a quantity that may assume any given value or set of values).
Collect the output of the model (e.g. number of healthy agents, number of infected agents, time…) while changing its variables (number of devils, number initially infected).

Data
Why should we get data?
- To answer questions
- To develop understanding
- To validate experiments
What should we do with data?
- Display: usually graph it to make it easier to see trends
- Analysis: use math skills to uncover patterns and trends in data sets
- Interpretation: propose possible explanations for those patterns and trends

Extracting Data from StarLogo TNG
There are three ways to extract data from StarLogo TNG:
- Collect the data by hand
- Create a chart in StarLogo TNG and extract the data to Excel
- Create a table in StarLogo TNG and extract the data to Excel

Why Should We Display Data? What did you see?
- Makes your data visible
- Helps find obvious patterns
- Does the data make sense?
  - Are your assumptions correct?
  - Did you collect enough data?

Why Should We Analyze Data?
What does it mean?
- Is there more information in the data?
  - emergent behavior
  - unexpected patterns
- Was the hypothesis correct?
Why does it matter?
- Draw conclusions from data (e.g. more grass gives more rabbits)
- Help you answer questions
- Provide visible evidence and support for your conclusions to your audience (e.g. Challenge judges)
- Check the validity of the model, experiment, theory, …

Ways to Analyze Data Plotting Data  Ways to visually understand data Statistics  Makes it easier to compare data  Mean, Median, Mode  Makes it clear if you have NOISY data  Range, Variance, Standard Deviation

Ways to Analyze Data
Derivatives (Slopes)
- Tell if changes in parameters affect data
  - Parameter 2 has a greater effect than Parameter 1
- Get more information from data
[Figure: plotted lines labeled Slope = 0.08, Slope = 0.16, and Slope = 0.39; label "Great Derivative"]

Collecting Data: Variable Sweeping
Did you collect enough data?
- Did you vary the parameters throughout their ranges?
- If you have sliders (input variables) in your program, you need data for the full range of those sliders. Use a minimum of 3 runs for a single variable (low, medium, high).
- With more than one slider (variable), you must vary them separately. For 2 variables, that means perhaps 9 runs (3 settings × 3 settings).
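The sweep described above can be planned before running the model. The following is only a sketch in Python (not part of the original slides); the slider names and the low, medium, and high values are made up for illustration.

```python
from itertools import product

# Hypothetical slider settings (low, medium, high) for two model inputs.
initial_population = [50, 100, 150]
percent_infected = [10, 30, 50]

# Every combination of the two sliders: 3 x 3 = 9 runs.
sweep = list(product(initial_population, percent_infected))

for run_number, (population, infected) in enumerate(sweep, start=1):
    print(f"Run {run_number}: population={population}, percent infected={infected}")
```

Each printed line is one row to fill in on the data sheet. Repeating every combination several times gives the multiple runs needed for averaging.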

Collecting Data from Starlogo TNG Gathering Data by hand  Tasmanian Devils  Variable sweep  More than one variable  Multiple runs at each variable combination  Average the data

Collecting Data from StarLogo TNG
Let's Do It
- Open Tasmanian Devil
- Run a section of the data sheet
- Do a variable sweep
  - Initial Population
  - Initial Percent Infected
- Multiple runs at each set of variables
- Collect output in data sheet
  - Number healthy after 200 ticks

Collecting Data from Starlogo TNG Put Data into Excel Calculate Averages
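The slides do this averaging step in Excel. As a rough equivalent, here is a Python sketch that groups repeated runs by their variable settings and averages the output. All numbers below are invented for illustration and are not results from the Tasmanian Devil model.

```python
from collections import defaultdict
from statistics import mean

# Made-up records: (initial population, percent infected, healthy after 200 ticks).
runs = [
    (50, 10, 31), (50, 10, 35), (50, 10, 33),
    (100, 10, 60), (100, 10, 66), (100, 10, 63),
]

# Group the repeated runs that share the same variable settings.
grouped = defaultdict(list)
for population, infected, healthy in runs:
    grouped[(population, infected)].append(healthy)

# One average per variable combination, ready for a summary table.
averages = {settings: mean(values) for settings, values in grouped.items()}

for (population, infected), average in averages.items():
    print(f"population={population}, infected={infected}%: average healthy = {average}")
```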

Collecting Data from Starlogo TNG Make a Summary Table Create XY Charts

Collecting Data from Starlogo TNG Make a 3D Chart

Plotting Data – Extracting from Starlogo TNG  Data can be extracted from a graph or a table in Starlogo TNG  Create a graph using the line graph block  Put reset clock on Setup block to clear and reset graph LET’S DO IT – Tasmanian Devils !!

Plotting Data – Extracting from StarlogoTNG  After program is run  Click on graph in Spaceland  Save File – Excel file LET’S DO IT – Tasmanian Devils !!

Data Analysis: Plotting Data – Types of Plots All plots from Pie Charts – music preference Pets purchased at pet store Bar Charts – preferred snacks

Data Analysis: Plotting Data – Types of Plots All plots from XY Graphs – cell phone use Scatter Plots

Plotting Data – Activity in Excel
- Open the Tasmanian Devil export file (CSV file) by double-clicking on the file
- In Excel: Insert Chart
- Select type of chart: XY Scatter
- Hit the Next button
LET’S DO IT

Plotting Data – Activity in Excel Select Data Range Highlight data to be plotted

Plotting Data – Activity in Excel Label each data series NEXT - Label Graph and Axis

Plotting Data – Activity in Excel Choose where you want the graph to be Get your graph

Plotting Data – Extracting from NetLogo
Two ways
- 1st way: Write code to extract the data you want (see File Output Example in the Code Examples)
  - Open file in setup procedure
  - Create a write-to-file procedure

Plotting Data – Extracting from NetLogo
- 2nd way: Extract data from NetLogo graphs
  - Have NetLogo generate graph on Interface page (example on later slide)
  - Create a setup-plot procedure and a do-plot procedure
  - Call the setup-plot procedure in setup procedure
  - Call do-plot procedure in go procedure

Plotting Data – Extracting from Netlogo  Run model until sufficient data obtained  (PC) Right Click on Graph/ (Mac) Control Click on Graph  Select Export  Choose location and File name - select save  Excel File is created – Next Slide Contains all the information in the plot and input parameters used. Contains excess information about the plot (color, pen down, mode, interval…) LET’S DO IT – Open Rabbits Grass Weeds

Plotting Data – Extracting from Netlogo This is what You need

Statistics
Statistics help you
- Summarize data
- Describe data
- Analyze data
[Figure: two data sets, blue and pink. Without statistics it is hard to describe the difference between the two data sets; with statistics it is easy to summarize, describe and analyze the data.]
The blue and the pink data have the same AVERAGE value (mean), but the blue data is “NOISIER” (greater standard deviation). Therefore…

Statistics – How to Calculate in Excel
- +, -, *, / are used for addition, subtraction, multiplication and division.
- Each cell has a label based on its column and row.
- Use cell references in calculations instead of typing numbers. Example: =(A4+B4)/C4
- To perform a calculation on an entire column, copy and paste the formula. Warning: this changes the row number in each cell reference for each line.
- To fix a specific cell, use the $ symbol. Example: =(A4+B4)/$C$1
- Excel has many built-in statistical functions. Makes life easy!
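To see what copying a formula down a column does, the two example formulas can be imitated outside Excel. This Python sketch is only an analogy with invented numbers: the lists stand in for columns A, B, and C starting at row 4, and `c1` stands in for the fixed cell $C$1.

```python
# Made-up spreadsheet values: columns A, B, and C for rows 4 to 6.
column_a = [2.0, 4.0, 6.0]
column_b = [1.0, 3.0, 5.0]
column_c = [3.0, 7.0, 11.0]
c1 = 10.0  # the single fixed cell $C$1

# =(A4+B4)/C4 copied down: every reference moves to the next row (C4, C5, C6).
relative = [(a + b) / c for a, b, c in zip(column_a, column_b, column_c)]

# =(A4+B4)/$C$1 copied down: A and B move with the row, $C$1 stays put.
absolute = [(a + b) / c1 for a, b in zip(column_a, column_b)]

print(relative)
print(absolute)
```

The first list divides each row by that row's own C value. The second divides every row by the same cell, which is what the $ symbol is for.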

Calculate in Excel Activity
- Open a blank spreadsheet in Excel
- Create 2 columns of numbers
- Then add, subtract, multiply and divide the first row
- Copy and paste the formulas

Statistics – Measurements of Central Tendency
Mean (Average), Median, and Mode
Definitions
- Mean (Average): sum divided by the number of data points
- Median: middle data point when arranged from highest to lowest (with an even number of points, the average of the two middle points)
- Mode: most frequent value
Use a data set to calculate Mean (Average), Median, Mode, Max and Min
- Select the cell where you want the value of the function to appear
- Select Insert, then Function
- Select Statistical
- Select the function wanted (AVERAGE, MEDIAN, or MODE), then hit OK
- Select the range of data you want to analyze by clicking on the range symbol and highlighting the range. Hit Enter or OK
LET’S DO IT: StarLogo TNG: Fish and Plankton data; NetLogo: Rabbits and Grass data
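The same measures can be checked outside Excel. A minimal sketch using Python's standard `statistics` module follows; the data set is invented for illustration and is not the Fish and Plankton or Rabbits and Grass data.

```python
from statistics import mean, median, mode

# Made-up data set with an even number of points.
data = [2, 4, 4, 5, 7, 8]

print("Mean:  ", mean(data))    # sum (30) divided by the number of points (6)
print("Median:", median(data))  # average of the two middle points, 4 and 5
print("Mode:  ", mode(data))    # most frequent value
print("Max:   ", max(data))
print("Min:   ", min(data))
```

These correspond to Excel's AVERAGE, MEDIAN, MODE, MAX, and MIN applied to the same range.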

Statistics – Measurements of Data Spread
Range, Variance and Standard Deviation
Definitions
- Range = maximum - minimum
- Variance measures the noise of the data around the mean value (it is based on the squared distances of the data points from the mean).
- Standard Deviation (S) is the square root of the variance. It is the most commonly used measure of spread (same units as the data).
Another reason to use S, for data that is roughly bell-shaped (normally distributed):
- ~68% of the data are in the interval Mean – S to Mean + S
- ~95% of the data are in the interval Mean – 2 S to Mean + 2 S
- ~99.7% of the data are in the interval Mean – 3 S to Mean + 3 S
EXCEL does it for you!!!
LET’S DO IT: StarLogo TNG: Fish and Plankton data; NetLogo: Rabbits and Grass data
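As a cross-check on the spread measures, here is a small Python sketch with an invented data set. It uses the sample versions of variance and standard deviation, the same convention as Excel's VAR and STDEV functions. With so few points, the share of data within one S is not expected to be exactly 68%.

```python
from statistics import mean, stdev, variance

# Made-up data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # Range = maximum - minimum
sample_variance = variance(data)     # sample variance (divides by n - 1)
s = stdev(data)                      # square root of the sample variance

# Fraction of the points that fall between Mean - S and Mean + S.
center = mean(data)
inside = [x for x in data if center - s <= x <= center + s]
within_one_s = len(inside) / len(data)

print("Range:", data_range)
print("Variance:", round(sample_variance, 3))
print("Standard deviation:", round(s, 3))
print("Fraction within one S:", within_one_s)
```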

Derivatives
What are derivatives?
- A simple calculation using data
- Instantaneous rate of change = SLOPE
Why use derivatives?
- Get more information from data
- More ways to compare data
Example: a car moving down a road
- Data = the distance traveled
- Velocity = the 1st derivative of distance (slope of distance)
- Acceleration = the 2nd derivative of distance = the 1st derivative of velocity (slope of velocity)
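With real data, a derivative is usually estimated as the slope between neighboring points. The sketch below does this in Python for the car example. The distance readings are invented, and `slopes` is a helper written for this illustration, not a library function.

```python
# Made-up distance readings (meters), taken once per second.
time_s = [0, 1, 2, 3, 4]
distance_m = [0, 2, 8, 18, 32]

def slopes(x, y):
    """Slope between each pair of neighboring points: (y2 - y1) / (x2 - x1)."""
    return [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]

# 1st derivative of distance: velocity (m/s) between each pair of readings.
velocity = slopes(time_s, distance_m)

# Each velocity belongs to the midpoint of its time interval.
midpoints = [(time_s[i] + time_s[i + 1]) / 2 for i in range(len(time_s) - 1)]

# 2nd derivative of distance = 1st derivative of velocity: acceleration (m/s^2).
acceleration = slopes(midpoints, velocity)

print("Velocity:", velocity)
print("Acceleration:", acceleration)
```

In this made-up example the velocity keeps rising while the acceleration stays constant, which is information the raw distance column does not show directly.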

A Note on Randomness This data is not RANDOM Random means that there is an equal probability of getting each outcome (like rolling a die) There is scatter in the data but it is not random

Other Things to Think About
Is there “scatter” in your model?
- Evaluate how the “scatter” affects your results: repeat model runs
- Make sure you get enough data to get good statistics
Did you collect enough data?
- Did you let the model run long enough? Has the model reached “equilibrium”?
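One possible way to judge whether a run has leveled off is to compare the average of the most recent values with the average of the values just before them. The Python sketch below is a hypothetical heuristic, not a method from the slides. The function name `has_settled`, the window size, the tolerance, and both data series are assumptions made for illustration.

```python
from statistics import mean

def has_settled(values, window=5, tolerance=0.05):
    """Return True if the mean of the last `window` values is within
    `tolerance` (as a fraction) of the mean of the `window` values before them."""
    if len(values) < 2 * window:
        return False  # not enough data to compare two windows
    recent = mean(values[-window:])
    previous = mean(values[-2 * window:-window])
    if previous == 0:
        return recent == 0
    return abs(recent - previous) / abs(previous) <= tolerance

# Made-up output series from two imaginary model runs.
still_changing = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
leveled_off = [10, 40, 70, 90, 98, 100, 101, 99, 100, 100, 101, 100, 99, 100, 100]

print(has_settled(still_changing))
print(has_settled(leveled_off))
```

A model with a lot of scatter may need a larger window or tolerance, and a plot of the output over time is still the first thing to check.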