Introductory Statistics and Data Analysis


1 Introductory Statistics and Data Analysis
MAT 135 Introductory Statistics and Data Analysis Adjunct Instructor Kenneth R. Martin Lecture 2 September 7, 2016

2 Confidential - Kenneth R. Martin
Agenda Housekeeping Readings Collect HW #1 Review HW #1 Chapter 1, 14 & 10 Quiz #1 Confidential - Kenneth R. Martin

3 Confidential - Kenneth R. Martin
Housekeeping Read, Chapter 1.1 – 1.4 Read, Chapter 14.1 – 14.2 Read, Chapter 10.1 HW #2 to be issued Confidential - Kenneth R. Martin

4 Confidential - Kenneth R. Martin
Housekeeping Connect Math Course code: HHGCW-JNEMX Confidential - Kenneth R. Martin

5 Confidential - Kenneth R. Martin
Housekeeping Collect HW #1 Confidential - Kenneth R. Martin

6 Confidential - Kenneth R. Martin
Housekeeping Review HW #1 Confidential - Kenneth R. Martin

7 Confidential - Kenneth R. Martin
Housekeeping Quiz #1 Beginning of next class, 9/14/2016 Material and associated readings covered through today. Confidential - Kenneth R. Martin

8 Statistics - Definition
Statistics: the science of the collection, analysis, interpretation and use of data relating to any group or groups of individuals, experiments, or data points (scores or raw scores). Also includes the planning of the collection of data, in terms of the design of surveys, experiments, etc. Confidential - Kenneth R. Martin

9 Statistics – Application to Research
Confidential - Kenneth R. Martin

10 Statistics - Definition
Descriptive Statistics: summarizing the data by describing what was observed in the sample taken, either numerically or graphically (tables, graphs, or single values). Inferential (Inductive) Statistics: drawing generalizations about the population represented from a smaller data sample, taking into account randomness. Confidential - Kenneth R. Martin
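As a rough illustration of numerical descriptive statistics, the sketch below summarizes a small sample with single values. The scores are made up for this example and are not course data.

```python
import statistics

# Hypothetical sample of 10 exam scores (illustrative values only).
scores = [72, 85, 90, 66, 85, 78, 94, 85, 70, 81]

summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "sample_stdev": statistics.stdev(scores),  # uses n - 1 in the denominator
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

These values only describe the ten observed scores; saying anything about a larger group of students would be an inferential step.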

11 Statistics - Definition
Variable: a characteristic or attribute that can take on different values. Confidential - Kenneth R. Martin

12 Statistics - Definition
Data (plural): measurements or observations that are usually numeric. Datum (singular): a single measurement or observation, usually referred to as a score or raw score. Confidential - Kenneth R. Martin

13 Confidential - Kenneth R. Martin
Statistics – Population and Sample
Population: the collection of all possible individuals, subjects, times, places, units, etc., that we wish to study.
Numerical measures that describe the population are referred to as “parameters”.

14 Confidential - Kenneth R. Martin
Statistics – Population and Sample
Sample: the particular subset of individuals, subjects, times, places, units, etc. from which measures are obtained.
It is a representative subset of the population.
Numerical measures computed from the sample are referred to as “statistics”.

15 Confidential - Kenneth R. Martin
Statistics – Population and Sample
Sample data shall be randomly drawn and representative of the entire population.
Random means every member of the population has an equal chance of being selected.
A sample is a limited number (n) of observations drawn from a larger source, the population (N).
As a rule of thumb, a sample should contain roughly 30 or more observations (n ≥ 30) to be statistically valid.

16 Confidential - Kenneth R. Martin
Statistics Population and Sample SAMPLE The particular individuals, subjects, times, places, units etc. from which measures are obtained POPULATION The collection of all possible individuals, subjects, times, places, units, etc. which we wish to study SAMPLING SCHEME The rules by which we will choose which individuals, etc. to include in the sample Confidential - Kenneth R. Martin

17 Confidential - Kenneth R. Martin
Statistics – Why collect samples?
It is often impractical to collect all the data from the entire population (e.g., the U.S. census).
Some test methods are destructive: we wouldn’t have any products or services left to ship to a customer!
It is too expensive to measure the entire population.
We don’t have to collect 100% of the population! We can use inferential statistics to draw sound conclusions about the population.
[Figure: a sampling scheme selects a SAMPLE from the POPULATION; data measured on the sample are used to draw conclusions about the population.]
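The relationship between a population value and its sample estimate can be sketched as follows. The population here is simulated (5000 hypothetical exam scores), so every number is an assumption made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: N = 5000 simulated exam scores.
population = [rng.gauss(70, 10) for _ in range(5000)]

# Parameter: a numerical summary of the whole population.
population_mean = statistics.mean(population)

# Statistic: the same summary computed from a random sample of n = 30.
sample = rng.sample(population, 30)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the population mean is usually unknown, which is why the sample mean is used to estimate it.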

18 Confidential - Kenneth R. Martin
Statistics Sample vs. Population (differentiation) Bradford County Confidential - Kenneth R. Martin

19 Confidential - Kenneth R. Martin
Statistics Sample vs. Population (differentiation) United States Confidential - Kenneth R. Martin

20 Confidential - Kenneth R. Martin
Statistics Sample vs. Population (differentiation) When the sample size becomes large, we start treating the data as if it is the population. When a particular “product / process” has a finite life, then you have all the data, and thus it is the population. When you’re confident you’ve seen all possible variations, then you can treat the sample data as the population. Confidential - Kenneth R. Martin

21 Confidential - Kenneth R. Martin
Statistics – Sample vs. Population (differentiation)
[Figure: dot plots of a 250-dot sample, a 1000-dot sample, and a 2000-dot sample]
This example illustrates that as the sample size goes up, the sample more clearly shows what is in the population.
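The effect of sample size can also be sketched numerically. The population below is simulated, and the sample sizes 250, 1000, and 2000 simply mirror the dot samples named on this slide; `standard_error` is a helper introduced for this example and uses the usual approximation of population standard deviation divided by the square root of n.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for repeatability

# Hypothetical population of 100,000 values spread uniformly over 0-100.
population = [rng.uniform(0, 100) for _ in range(100_000)]
population_mean = statistics.fmean(population)
population_sd = statistics.pstdev(population)

def standard_error(n):
    """Approximate spread of the sample mean for random samples of size n."""
    return population_sd / n ** 0.5

for n in (250, 1000, 2000):
    sample = rng.sample(population, n)
    gap = abs(statistics.fmean(sample) - population_mean)
    print(f"n={n:5d}  gap between sample and population mean = {gap:.3f}  "
          f"(typical gap is about {standard_error(n):.3f})")
```

Any single sample can land closer or farther than the typical gap, but the typical gap itself shrinks as n grows.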

22 Statistics -Characteristics of a Good Sampling Scheme
Generates samples that are representative of the population Uses a random selection mechanism Each element of the population has an equal chance of being in the sample Unbiased selection of samples Some sort of objective random process used to select the elements in the sample The 3 items in red are the primary characteristics of good samples that students need to know. Sampling Scheme Confidential - Kenneth R. Martin

23 Statistics – Random Sampling
[Figure: a population of units (X) with a smaller sample drawn from it]
Basic Idea of Random Sampling
Each unit in the population has an equal chance of being selected.
Elements are not selected based on a preliminary look at their measurement values.
Some additional rules may be used, however, to ensure all elements of the population are considered.

24 Statistics – Sampling Variations
Purely Random
In a figurative sense: put the entire population in a hat, then reach in and randomly select the individuals to include in the sample. Or use a random number generator.
Example: Randomly select 100 inquiries from the 5000 received last month.
Systematic Sampling
Select every kth individual in the population (assumes that the individuals are ordered in some way that is not related to what you are measuring).
Example: Starting from a randomly chosen inquiry, select every 50th inquiry from the 5000 received last month.
Stratified Sampling
If the population has subgroups that may be related to what you are measuring, randomly select a certain number from each subgroup for the sample. Select more from larger subgroups and fewer from smaller ones, to keep the sample proportional.
Example: “Since 35% of the inquiries were about our software products, 40% about hardware, and 25% about other things, randomly select 35 software inquiries, 40 hardware inquiries, and 25 other inquiries, for a total sample of 100.”
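The three sampling variations above can be sketched in a few lines. The inquiry records, the `topic` field, and the function names are all hypothetical; only the counts (5000 inquiries, a 35% / 40% / 25% split, and a sample of 100) come from the examples on this slide.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical data: 5000 inquiries split 35% / 40% / 25% by topic.
topics = ["software"] * 1750 + ["hardware"] * 2000 + ["other"] * 1250
inquiries = [{"id": i, "topic": t} for i, t in enumerate(topics)]

def simple_random_sample(items, n):
    """Purely random: every item has the same chance of selection."""
    return rng.sample(items, n)

def systematic_sample(items, k):
    """Systematic: random starting point, then every kth item."""
    start = rng.randrange(k)
    return items[start::k]

def stratified_sample(items, n, key):
    """Stratified: sample each subgroup in proportion to its size."""
    groups = {}
    for item in items:
        groups.setdefault(item[key], []).append(item)
    sample = []
    for members in groups.values():
        # Rounded quotas add up to n here; other splits may need adjusting.
        quota = round(n * len(members) / len(items))
        sample.extend(rng.sample(members, quota))
    return sample

simple = simple_random_sample(inquiries, 100)
systematic = systematic_sample(inquiries, 50)
stratified = stratified_sample(inquiries, 100, "topic")

print(len(simple), len(systematic), len(stratified))
print(Counter(item["topic"] for item in stratified))
```

Note that the systematic sample is only trustworthy if the ordering of the inquiries is unrelated to what is being measured, as the slide points out.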

25 Statistics – Research Methods
Definitions Science: The study of some phenomena, through strict observation, evaluation, interpretation, and theoretical explanation. Experiment: Any study that demonstrates cause, by following strict procedures to ensure all other possible causes are eliminated or highly unlikely. Confidential - Kenneth R. Martin

26 Statistics – Research Methods
Definitions Observational study: the researcher simply observes what is or what did happen, and tries to draw conclusions based upon these observations. Experimental study: the researcher manipulates one (or more) of the variables and tries to determine how the manipulation influences the other variables. Confidential - Kenneth R. Martin

27 Statistics – Research Methods
Definitions Independent Variable (IV): The variable that is manipulated in an experiment. It is the “presumed cause”. Dependent Variable (DV): The variable that is measured in each group of the study. It is the “presumed effect”, and is believed to change in the presence of the IV. Confounding Variable (CV): a variable that influences the DV, but was not completely separated from the IV. Confidential - Kenneth R. Martin

28 Statistics – Research Methods
Definitions Operational Definition: A definition of how we will measure the DV. Confidential - Kenneth R. Martin

29 Statistics – Research Methods
Experimental Study Requirements that must be satisfied for the Experimental Study, to allow the researcher to be able to draw cause-effect conclusions. Manipulation - of the variables that operate in an experiment. Randomization - of assigning participants to conditions Comparison / Control - something to compare the results Confidential - Kenneth R. Martin

30 Statistics – Research Methods
Experimental Study
Q: Do distractions during an exam affect student scores?
Requirement 1
Requirement 2
Requirement 3
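One possible way to handle the randomization requirement for a question like this is to shuffle the participants and deal them into conditions. The participant labels, the two condition names, and the `randomly_assign` helper are assumptions made for illustration; they are not taken from the slides.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the input list is not modified
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

# Hypothetical study: 20 students, one distraction group and one control group.
participants = [f"student_{i}" for i in range(1, 21)]
groups = randomly_assign(participants, ["distraction", "quiet (control)"], seed=7)

for condition, members in groups.items():
    print(condition, len(members), members[:3])
```

Because chance alone decides who lands in which group, pre-existing differences between students should balance out on average, which is what supports a cause-effect comparison.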

31 Statistics – Research Methods
Correlational Method (Scatter Diagram)
Mathematical representation of two (or three) variables, using Cartesian coordinates, for a common data set.
Identifies potential relationships and may suggest causation (if applicable).
Independent Variable on the x-axis, Dependent Variable on the y-axis.
X is the input to some “process”, and Y is the output.
The X variable can be incremented or decremented.

32 Statistics – Research Methods
Correlational Method (Scatter Diagram)
Can support a hypothesis that the variables are related.
Observe the direction and “tightness” of the data.
The closer the points fall to a straight line, the greater the correlation between the variables.

33 Statistics – Research Methods
Correlational Method (Scatter Diagram)
The strength of the relationship can be tested mathematically (correlation coefficient).
We can only say that X and Y are related, not that one caused the other.
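The slides do not say which correlation coefficient is meant; a common choice is Pearson's r, sketched below. The hours and scores are invented data, and `pearson_r` is a helper written for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 60, 61, 70, 74, 83]

print(f"r = {pearson_r(hours, scores):.3f}")
```

Values of r near +1 or -1 correspond to points lying close to a straight line; values near 0 indicate little linear relationship. A large r still says nothing about which variable, if either, causes the other.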

34 Statistics – Research Methods
Scatter Diagram (positive correlation)

35 Statistics – Research Methods
Scatter Diagram (negative correlation)

36 Statistics – Research Methods
Scatter Diagram – 3D Scatter & Contour Plot

37 Statistics - Definition
Data:
Variables: those items that are measurable.
Continuous: measurable characteristics that can take on any value, within measurement capability (e.g., temperature, distance). [Quantitative]
Discrete: countable characteristics that can only take on specific (integer) values (e.g., # defects, # failures, # people). [Quantitative or Qualitative]

38 Statistics - Definition
Data:
Attributes: those items that are either conforming or non-conforming to a specification (e.g., go / no-go; pass / fail; black / white; I / O). [Qualitative]

39 Statistics - Definition
Data: Quantitative: varies by amount Qualitative: varies by class / category Confidential - Kenneth R. Martin

40 Continuous vs. Discrete vs. Attribute Data
[Figure: types of data]
Continuous: infinite number of possible measurements in a continuum
Discrete: Count: 1 2 3 4 5 6 7 8 9 10
Discrete: Ordinal: 1 2 3 4 5 6 7 8 9 10, or “low” / “small” / “short”, “medium” / “mid”, “high” / “large” / “tall”
Discrete: Nominal or Categorical: defines several groups, with no order or ranking (Group A, Group B, Group C, Group D, Group E, Group F)
Attribute: Binary: defines TWO groups, with no order (“bad” / “no-go” / “group #1” vs. “good” / “go” / “group #2”)

41 Continuous vs. Discrete vs. Attribute Data
Continuous: can take on any measurement value, within measurement capability (e.g., the weight of people).
Count: can only take on specific integer values (e.g., the number of people in a room).
Ordinal: shows order; one rank is simply greater than or less than another (e.g., shirt sizes).
Nominal / Categorical: groupings of common items / values. Groups do not imply any importance, order, or additional information (e.g., seasons of birth).
Binary: defines two possible outcomes, with no order (e.g., a standard light switch: on / off).

42 Statistics - Definition
Accuracy: how close the measured values are to the true (actual) value.
Precision: how close the measured values are to each other.
[Figure: panels labeled by combinations of low or high accuracy and low or high precision]
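One way to put numbers on these two ideas is to measure accuracy as the distance between the average reading and the true value, and precision as the spread of the readings. The two scales and their readings below are hypothetical, and the helper names are introduced only for this sketch.

```python
import statistics

def accuracy_error(measurements, true_value):
    """Distance between the average reading and the true value (smaller = more accurate)."""
    return abs(statistics.mean(measurements) - true_value)

def precision_spread(measurements):
    """Sample standard deviation of the readings (smaller = more precise)."""
    return statistics.stdev(measurements)

true_weight = 100.0  # hypothetical reference weight

# Hypothetical repeated readings from two scales.
scale_a = [100.1, 99.9, 100.0, 100.2, 99.8]    # centered on the true value
scale_b = [103.0, 103.1, 102.9, 103.0, 103.0]  # tightly grouped but offset

for name, readings in (("scale A", scale_a), ("scale B", scale_b)):
    print(f"{name}: accuracy error = {accuracy_error(readings, true_weight):.2f}, "
          f"spread = {precision_spread(readings):.2f}")
```

In this made-up data, scale B is the more precise instrument but the less accurate one, which shows that the two properties are independent.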

43 Confidential - Kenneth R. Martin
Statistics – Resolution: the measurement device shall be able to read to at least one decimal place beyond the level of detail needed. For example, if you are interested in a person’s weight only to the nearest pound, the scale shall be able to measure to a tenth of a pound.

44 Confidential - Kenneth R. Martin
Statistics – Numbers: When rounding, first determine the resolution of data needed. Then look at the digit immediately to the right of the last digit you are keeping.
If that digit is 0-4, keep the last digit as it is (round down).
If that digit is 5-9, increase the last digit by one (round up).
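The rounding rule above can be sketched with Python's `decimal` module. Python's built-in `round()` rounds exact halves to the nearest even digit, which differs from the rule on this slide, so this sketch uses `ROUND_HALF_UP` instead. The helper name `round_half_up` is an assumption made for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places=0):
    """Round so that a following digit of 5-9 rounds up and 0-4 rounds down."""
    quantum = Decimal(10) ** -places  # e.g. places=2 gives Decimal("0.01")
    # Going through str() avoids binary floating-point surprises such as 2.675.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up(4.24, 1))   # next digit is 4, so round down
print(round_half_up(4.25, 1))   # next digit is 5, so round up
print(round_half_up(2.5))       # built-in round(2.5) would give 2
```

For negative numbers, `ROUND_HALF_UP` rounds halves away from zero.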

45 Confidential - Kenneth R. Martin
Statistics – Numbers: Significant figures: the digits of a number that carry meaning, exclusive of leading zeros that serve only to locate the decimal point.
[Examples: three numbers with their significant-digit counts; the values were not preserved in this transcript]

46 Confidential - Kenneth R. Martin
Statistics – Numbers:
Multiplication / division / exponentiation: the number with the fewest significant figures dictates the significant figures of the answer.
6.59 × 2.3 = 15.157 = 15 (2 sig. digits)
Addition / subtraction: the answer has no more digits after the decimal point than the number with the fewest digits after the decimal point.
… = 4.29 = 4.3
8.1 × … = 6.9 × 10³
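The multiplication rule can be sketched with a small helper that rounds a result to a chosen number of significant figures. `round_sig` is a name introduced for this example; it relies on Python's built-in `round()`, which sends exact halves to the nearest even digit.

```python
import math

def round_sig(x, sig):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    # Position of the leading digit: 15.157 -> 1, 0.004567 -> -3.
    magnitude = math.floor(math.log10(abs(x)))
    return round(x, sig - 1 - magnitude)

product = 6.59 * 2.3           # 3 sig. figures times 2 sig. figures
print(round_sig(product, 2))   # keep 2 significant figures
print(round_sig(0.004567, 2))  # leading zeros are not significant
print(round_sig(6874, 2))      # trailing places are zero-filled
```

The number of significant figures to keep (2 in the first call) still has to be chosen by inspecting the inputs, as the rule on the slide describes.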

47 Confidential - Kenneth R. Martin
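Statistics – Math rule (order of operations): Please (Parentheses) Routinely (Roots) Excuse (Exponents) My (Multiplication) Dear (Division) Aunt (Addition) Sally (Subtraction). Multiplication and division have equal priority and are worked left to right; the same holds for addition and subtraction.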
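Python follows the same conventional order of operations, so the rule can be checked step by step. The expression below is an arbitrary example chosen for illustration.

```python
import math

# Parentheses first, then exponents, then * and /, then + and -.
result = 2 + 3 * 4 ** 2 / (1 + 7)

# The same calculation, one step at a time:
step1 = 1 + 7            # parentheses -> 8
step2 = 4 ** 2           # exponent    -> 16
step3 = 3 * step2        # multiply    -> 48
step4 = step3 / step1    # divide      -> 6.0
step5 = 2 + step4        # add         -> 8.0

# Multiplication and division share one priority level and run left to right.
left_to_right = 8 / 4 * 2          # (8 / 4) * 2 = 4.0, not 8 / (4 * 2) = 1.0

# A root is written as a function call, so its argument is evaluated first.
root_example = math.sqrt(9 + 16)   # sqrt(25) = 5.0

print(result, step5, left_to_right, root_example)
```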

