MGMT 276: Statistical Inference in Management Fall, 2014 Green sheets.


2 MGMT 276: Statistical Inference in Management Fall, 2014 Green sheets

3 By the end of lecture today (9/11/14)
Use this as your study guide
Questionnaire design and evaluation
Surveys and questionnaire design
Random versus non-random sampling techniques
Correlational methodology
Dot plots
Frequency distributions and frequency histograms
Frequency, cumulative frequency
Relative frequency, cumulative relative frequency
Guidelines for constructing frequency distributions

4 Just a reminder
Talking or whispering to your neighbor can be a problem for us. Please consider writing short notes.
Complete this soon and receive extra credit! (By September 16th, 2014)
A note on doodling

6 Homework due (September 16th)
On class website: please print and complete homework worksheets #2 and #3.
We’ll be using this for a writing assignment on Tuesday.

7 Schedule of readings
Before next exam:
Please read Chapters 1 - 4 and Appendices D and E in Lind
Please read Chapters 1, 5, 6 and 13 in Plous
Chapter 1: Selective Perception
Chapter 5: Plasticity
Chapter 6: Effects of Question Wording and Framing
Chapter 13: Anchoring and Adjustment

8 Review
A questionnaire is a set of fixed-format, self-report items completed without supervision or time constraint.
Response rate and power of random sampling
Number of responders versus percentage of responders
Wording, order, and balance can all affect results. Really important regarding bias!
Questionnaires use self-report items for measuring constructs.
Constructs are operationally defined by the content of items.

10 Review
As “composers” of questionnaire data, how do we design the best possible product?
Iterative design process: pilot, fix, pilot, analyze, fix, pilot, all the way through your design
As “consumers” of questionnaire data, what should we ask?
Number of responders versus percentage of responders
Operational definitions of constructs
Wording
Methodology of sampling
Questionnaires use self-report items for measuring constructs.
Constructs are operationally defined by the content of items.

11 Preview of Questionnaire Homework
There are four parts:
1. Statement of objectives
2. Questionnaire itself (which is the operational definition of the objectives)
3. Data collection and creation of database
4. Creation of graphs representing results

12 Questionnaire Homework
Objectives: This study will examine some of the subject characteristics that predict whether an individual is likely to prefer modern music characterized by amplified guitar rock and roll sounds or older styles of music characterized by acoustic, orchestral (like cello) classical sounds. We will examine whether gender and age are associated with musical preference.

13 Questionnaire Homework

14 Questionnaire Homework

15 Questionnaire Homework
What might you graph?

16 Questionnaire Homework

17 Questionnaire Homework

18 Questionnaire Homework
Average of these three scores

19 Questionnaire Homework
Average of these two scores

20 Questionnaire Homework
Variable label and scale values

21 Questionnaire Homework
Average of these three scores

22 Questionnaire Homework
Average of these two scores

23 Questionnaire Homework
Variable label and scale values

24 Questionnaire Homework

25 Iterative design process
Peer review is an important skill in nearly all areas of business and science. Please strive to provide productive, useful and kind feedback as you complete your peer review.

26 Sample versus census
How is a census different from a sample?
A census measures each person in the specific population.
A sample measures a subset of the population and infers about the population. A representative sample is good.
What’s better?
Use of existing survey data
U.S. Census: family size, fertility, occupation
The General Social Survey: surveys a sample of US citizens, over 1,000 items, same questions asked each year
You’ve completed constructing your questionnaire… what’s the best way to get responders?

27 Population (census) versus sample
Parameter versus statistic
Parameter: measurement or characteristic of the population
Usually unknown (only estimated)
Usually represented by Greek letters (µ, pronounced “mu” or “mew”)
Statistic: numerical value calculated from a sample
Usually represented by Roman letters (x̄, pronounced “x bar”)

28 Simple random sampling: each person from the population has an equal probability of being included
Sample frame = how you define population
Let’s take a sample… a random sample
Question: Average weight of U of A football player
Sample frame: population of the U of A football team
Random number table: a list of random numbers. If the number is 64, pick the 64th name on the list (64 is just an example here).
Or, you can use Excel to provide the number for a random sample: =RANDBETWEEN(1,115). If it returns 24, pick the 24th name on the list.
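
The Excel draw on this slide can be mirrored in a few lines of Python. This is a minimal sketch, not part of the lecture: the roster names and the seed are placeholders, and only the range 1 to 115 comes from the slide.

```python
import random

# Hypothetical sampling frame: 115 roster positions, matching =RANDBETWEEN(1,115).
roster = [f"Player {i}" for i in range(1, 116)]

rng = random.Random(276)  # arbitrary fixed seed so the run is repeatable

# One draw, like RANDBETWEEN(1, 115): every position is equally likely.
position = rng.randint(1, 115)
print("Pick name number", position, "->", roster[position - 1])

# A simple random sample of 10 players, drawn without replacement.
sample = rng.sample(roster, k=10)
print(sample)
```

Sampling without replacement keeps any one player from being weighed twice.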

29 Systematic random sampling: a probability sampling technique that involves selecting every kth person from a sampling frame
You pick the number
Other examples of systematic random sampling:
1) check every 2000th light bulb
2) survey every 10th voter
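
Selecting every kth person might be sketched as follows. The frame of 100 voters, the random starting offset, and the helper name `systematic_sample` are assumptions made for illustration; the slide only specifies "every kth".

```python
import random

def systematic_sample(frame, k, rng):
    """Return every k-th item of frame, starting at a random offset in [0, k)."""
    start = rng.randrange(k)
    return frame[start::k]

voters = list(range(1, 101))   # hypothetical frame: voters numbered 1 to 100
rng = random.Random(0)         # arbitrary fixed seed
picked = systematic_sample(voters, 10, rng)  # survey every 10th voter
print(picked)
```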

30 Stratified sampling: a sampling technique that involves dividing a population into subgroups (or strata) and then selecting samples from each of these groups. The technique can maintain ratios for the different groups.
Average number of speeding tickets
17.7% of sample are Pre-business majors
4.6% of sample are Psychology majors
2.8% of sample are Biology majors
2.4% of sample are Architecture majors
etc
Average cost for text books for a semester
12% of sample is from California
7% of sample is from Texas
6% of sample is from Florida
6% from New York
4% from Illinois
4% from Ohio
4% from Pennsylvania
3% from Michigan
etc
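
A proportional stratified sample could look roughly like this. The population of 10,000 students, the "Other" stratum, and the 1-in-10 sampling fraction are all assumptions; only the four percentages come from the slide.

```python
import random

# Hypothetical population of 10,000 students. The first four counts follow the
# slide's percentages; "Other" is an assumed catch-all for the slide's "etc".
strata_sizes = {"Pre-business": 1770, "Psychology": 460, "Biology": 280,
                "Architecture": 240, "Other": 7250}
population = {major: [f"{major} #{i}" for i in range(n)]
              for major, n in strata_sizes.items()}

rng = random.Random(1)  # arbitrary fixed seed

# Draw the same fraction (1 in 10) at random from every stratum.
sample = {major: rng.sample(members, len(members) // 10)
          for major, members in population.items()}

total = sum(len(chosen) for chosen in sample.values())
shares = {major: len(chosen) / total for major, chosen in sample.items()}
for major, share in shares.items():
    print(f"{major}: {share:.1%} of sample")
```

Because each stratum is sampled at the same rate, the sample keeps the population's ratios.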

31 Cluster sampling: a sampling technique that divides a population into subgroups (or clusters) by region or physical space. Can either measure everyone or select samples for each cluster.
Textbook prices: Southwest schools, Midwest schools, Northwest schools, etc
Average student income, survey by: Old Main area, near McClelland, around Main Gate, etc
Patient satisfaction for hospital: 7th floor (near maternity ward), 5th floor (near physical rehab), 2nd floor (near trauma center), etc
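
The two options on this slide (measure everyone, or sample within each cluster) can be sketched as below. The patient IDs, cluster sizes, and the idea of choosing the fully measured cluster at random are assumptions for illustration.

```python
import random

# Hypothetical clusters: hospital floors from the slide, with made-up patient IDs.
clusters = {
    "7th floor": [f"7-{i}" for i in range(1, 21)],
    "5th floor": [f"5-{i}" for i in range(1, 31)],
    "2nd floor": [f"2-{i}" for i in range(1, 16)],
}
rng = random.Random(2)  # arbitrary fixed seed

# Option 1: pick a cluster and measure everyone in it.
chosen = rng.choice(sorted(clusters))
everyone = clusters[chosen]

# Option 2: select a sample (here 5 patients) from each cluster.
per_cluster = {name: rng.sample(members, 5) for name, members in clusters.items()}

print(chosen, len(everyone))
print(per_cluster)
```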

32 Non-random sampling is vulnerable to bias
Snowball sampling: a non-random technique in which one or more members of a population are located and used to lead the researcher to other members of the population. Used when we don’t have any other way of finding them. Also vulnerable to biases.
Convenience sampling: sampling technique that involves sampling people nearby. A non-random sample and vulnerable to bias.
Judgment sampling: sampling technique that involves sampling people who an expert says would be useful. A non-random sample and vulnerable to bias.

33 You’ve gathered your data…what’s the best way to display it??

34 Describing Data Visually
Lists of numbers: too hard to see patterns
14 17 20 25 21 29
16 25 27 18 16 13
11 21 19 24 20 11
20 28 16 13 17 14
14 16 8 17 17 11
11 14 17 19 24 8
16 12 25 9 20 17
11 14 16 18 22
14 18 23 12 15
10 13 15 11 11
Organizing numbers helps
8 11 14 17 19 24
8 12 14 17 20 25
9 12 15 17 20 25
10 13 15 17 20 25
11 13 16 17 20 27
11 13 16 17 21 28
11 14 16 18 21 29
11 14 16 18 22
11 14 16 18 23
11 14 16 19 24
Graphical representation is even more clear
This is a dot plot
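
A rough text version of a dot plot for the unsorted list on this slide, as a sketch only: one dot per observation, stacked sideways next to each value. The values are read off the slide's list; the layout is an illustrative choice.

```python
from collections import Counter

# The 57 unsorted values from the slide, row by row.
raw = [14, 17, 20, 25, 21, 29, 16, 25, 27, 18, 16, 13,
       11, 21, 19, 24, 20, 11, 20, 28, 16, 13, 17, 14,
       14, 16, 8, 17, 17, 11, 11, 14, 17, 19, 24, 8,
       16, 12, 25, 9, 20, 17, 11, 14, 16, 18, 22,
       14, 18, 23, 12, 15, 10, 13, 15, 11, 11]

counts = Counter(raw)

# One row per possible value, one dot per observation of that value.
for value in range(min(raw), max(raw) + 1):
    print(f"{value:2d} {'.' * counts[value]}")
```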

35 Describing Data Visually
8 12 14 17 19 24
8 12 14 17 20 25
9 13 15 17 20 25
10 13 15 17 20 25
11 13 16 17 20 27
11 13 16 17 21 28
11 14 16 18 21 29
11 14 16 18 22
11 14 16 18 23
11 14 16 19 24
Measuring the “frequency of occurrence”
We’ve got to put these data into groups (“bins”)
Then figure “frequency of occurrence” for the bins

36 Frequency distributions
Frequency distribution: an organized list of observations and their frequency of occurrence
How many kids are in your family?
What is the most common family size?

37 Another example: How many kids in your family?
3 4 8 2 2 1 4 1 14 2
Number of kids in family (in order): 1 1 2 2 2 3 4 4 8 14
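
Counting how often each family size occurs gives the frequency distribution directly. A minimal sketch using the ten answers on this slide:

```python
from collections import Counter

kids = [3, 4, 8, 2, 2, 1, 4, 1, 14, 2]   # answers from the slide

freq = Counter(kids)

# Organized list: each observed family size with its frequency of occurrence.
for size in sorted(freq):
    print(size, freq[size])

most_common_size, count = freq.most_common(1)[0]
print("Most common family size:", most_common_size, "reported", count, "times")
```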

38 Frequency distributions
Crucial guidelines for constructing frequency distributions:
1. Classes should be mutually exclusive: each observation should be represented only once (no overlap between classes)
Wrong: 0 - 5, 5 - 10, 10 - 15
Correct: 0 - 4, 5 - 9, 10 - 14
Correct: 0 - under 5, 5 - under 10, 10 - under 15
2. Set of classes should be exhaustive: should include all possible data values (no data points should fall outside range)
Wrong: 0 - 3, 4 - 7, 8 - 11 (no place for our family of 14!)
Correct: 0 - 3, 4 - 7, 8 - 11, 12 - 15
How many kids are in your family? What is the most common family size?
Number of kids in family: 1 1 2 2 2 3 4 4 8 14
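
Both guidelines can be checked mechanically. The helper `check_classes` below is a hypothetical name, and it assumes whole-number data with inclusive class limits, as in the slide's examples.

```python
def check_classes(classes, data):
    """classes: list of (low, high) inclusive limits. Returns (exclusive, exhaustive)."""
    ordered = sorted(classes)
    # Mutually exclusive: each class must end before the next one starts.
    exclusive = all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))
    # Exhaustive: every observation must fall inside some class.
    exhaustive = all(any(lo <= x <= hi for lo, hi in classes) for x in data)
    return exclusive, exhaustive

kids = [3, 4, 8, 2, 2, 1, 4, 1, 14, 2]

overlapping = check_classes([(0, 5), (5, 10), (10, 15)], kids)   # 5 and 10 fit two classes
too_short = check_classes([(0, 3), (4, 7), (8, 11)], kids)       # no class for 14
correct = check_classes([(0, 3), (4, 7), (8, 11), (12, 15)], kids)
print(overlapping, too_short, correct)
```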

39 Frequency distributions
Crucial guidelines for constructing frequency distributions:
3. All classes should have equal intervals (even if the frequency for that class is zero)
Wrong: 0 - 1, 2 - 10, 11 - 19
Correct: 0 - 4, 5 - 9, 10 - 14
Correct: 0 - under 5, 5 - under 10, 10 - under 15
How many kids are in your family? What is the most common family size?
Number of kids in family: 1 1 2 2 2 3 4 4 8 14

40 4. Selecting number of classes is subjective
Generally 5 - 15 will often work
8 12 14 17 19 24
8 12 14 17 20 25
9 13 15 17 20 25
10 13 15 17 20 25
11 13 16 17 20 27
11 13 16 17 21 28
11 14 16 18 21 29
11 14 16 18 22
11 14 16 18 23
11 14 16 19 24
How about 6 classes (“bins”)?
How about 8 classes (“bins”)?
How about 16 classes (“bins”)?

41 5. Class width should be round (easy) numbers
Round numbers: 5, 10, 15, 20 etc or 3, 6, 9, 12 etc
Lower boundary can be multiple of interval size
Clear & Easy: 8 - 11, 12 - 15, 16 - 19, 20 - 23, 24 - 27, 28 - 31
6. Try to avoid open-ended classes
For example: 10 and above, Greater than 100, Less than 50
8 12 14 17 19 24
8 12 14 17 20 25
9 13 15 17 20 25
10 13 15 17 20 25
11 13 16 17 20 27
11 13 16 17 21 28
11 14 16 18 21 29
11 14 16 18 22
11 14 16 18 23
11 14 16 19 24
Remember: This is all about helping readers understand quickly and clearly.
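
To see the “Clear & Easy” classes in action, the sketch below counts the values listed on this slide into bins of width 4 starting at 8. The data are typed row by row as they appear here; the variable names are illustrative.

```python
from collections import Counter

rows = [
    [8, 12, 14, 17, 19, 24],
    [8, 12, 14, 17, 20, 25],
    [9, 13, 15, 17, 20, 25],
    [10, 13, 15, 17, 20, 25],
    [11, 13, 16, 17, 20, 27],
    [11, 13, 16, 17, 21, 28],
    [11, 14, 16, 18, 21, 29],
    [11, 14, 16, 18, 22],
    [11, 14, 16, 18, 23],
    [11, 14, 16, 19, 24],
]
data = [x for row in rows for x in row]

start, width, n_classes = 8, 4, 6          # classes 8 - 11 through 28 - 31
counts = Counter((x - start) // width for x in data)

table = [(start + i * width, start + i * width + width - 1, counts.get(i, 0))
         for i in range(n_classes)]
for low, high, frequency in table:
    print(f"{low} - {high}: {frequency}")
```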

42 Let’s do one
Scores on an exam
82 58 64 80
75 72 87 73
88 94 84 78
93 69 70 60
53 84 76 87
84 61 89 95
87 91 75 99
Step 1: List scores
Step 2: List scores in order
53 58 60 61 64 69 70 72 73 75 76 78 80 82 84 87 88 89 91 93 94 95 99
Step 3: Decide whether grouped or ungrouped
How to figure how many values: largest number - smallest number + 1
99 - 53 + 1 = 47
If less than 10 groups, “ungrouped” is fine
If more than 10 groups, “grouped” might be better
Step 4: Generate number and size of intervals (or size of bins)
Sample size (n)    Number of classes
10 - 16            5
17 - 32            6
33 - 64            7
65 - 128           8
129 - 255          9
256 - 511          10
512 - 1,024        11
If we have 6 bins, we’d have intervals of 8
Whaddya think? Would intervals of 5 be easier to read?
Let’s just try it and see which we prefer…
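
Step 4 can be sketched as a table lookup followed by one division. The lookup values are the slide's; rounding the interval up to a whole number (47 / 6 is about 7.8, so 8) is an assumption consistent with the slide's “intervals of 8”, and the function names are illustrative.

```python
# Sample-size ranges and suggested number of classes, as listed on the slide.
CLASS_TABLE = [(10, 16, 5), (17, 32, 6), (33, 64, 7), (65, 128, 8),
               (129, 255, 9), (256, 511, 10), (512, 1024, 11)]

def suggested_classes(n):
    """Look up the suggested number of classes for a sample of size n."""
    for low, high, classes in CLASS_TABLE:
        if low <= n <= high:
            return classes
    raise ValueError("sample size is outside the slide's table")

def interval_width(smallest, largest, classes):
    """Largest - smallest + 1 possible values, spread over the classes (rounded up)."""
    span = largest - smallest + 1
    return -(-span // classes)   # ceiling division on integers

scores = [82, 58, 64, 80, 75, 72, 87, 73, 88, 94, 84, 78, 93, 69,
          70, 60, 53, 84, 76, 87, 84, 61, 89, 95, 87, 91, 75, 99]

k = suggested_classes(len(scores))
w = interval_width(min(scores), max(scores), k)
print(len(scores), "scores ->", k, "classes of width", w)
```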

43 Scores on an exam
82 58 64 80
75 72 87 73
88 94 84 78
93 69 70 60
53 84 76 87
84 61 89 95
87 91 75 99
53 58 60 61 64 69 70 72 73 75 76 78 80 82 84 87 88 89 91 93 94 95 99
Let’s just try it and see which we prefer…
10 bins, interval of 5
Score       Frequency
95 - 99     2
90 - 94     3
85 - 89     5
80 - 84     5
75 - 79     4
70 - 74     3
65 - 69     1
60 - 64     3
55 - 59     1
50 - 54     1
6 bins, interval of 8
Score       Frequency
93 - 100    4
85 - 92     6
77 - 84     6
69 - 76     7
61 - 68     2
53 - 60     3
Remember: This is all about helping readers understand quickly and clearly.
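
Both candidate tables can be generated from the raw scores, which makes it easy to try other widths as well. A sketch, with `frequency_table` as an illustrative helper name:

```python
def frequency_table(data, start, width, n_classes):
    """Return (low, high, count) rows, highest class first as on the slides."""
    rows = []
    for i in range(n_classes):
        low = start + i * width
        high = low + width - 1
        rows.append((low, high, sum(low <= x <= high for x in data)))
    return rows[::-1]

scores = [82, 58, 64, 80, 75, 72, 87, 73, 88, 94, 84, 78, 93, 69,
          70, 60, 53, 84, 76, 87, 84, 61, 89, 95, 87, 91, 75, 99]

ten_bins = frequency_table(scores, start=50, width=5, n_classes=10)
six_bins = frequency_table(scores, start=53, width=8, n_classes=6)

for title, table in (("10 bins, interval of 5", ten_bins),
                     ("6 bins, interval of 8", six_bins)):
    print(title)
    for low, high, count in table:
        print(f"{low} - {high}: {count}")
```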

44 Scores on an exam
82 58 64 80
75 72 87 73
88 94 84 78
93 69 70 60
53 84 76 87
84 61 89 95
87 91 75 99
Score       Frequency
95 - 99     2
90 - 94     3
85 - 89     5
80 - 84     5
75 - 79     4
70 - 74     3
65 - 69     1
60 - 64     3
55 - 59     1
50 - 54     1
Let’s make a frequency histogram using 10 bins and bin width of 5!!

45 Step 6: Complete the Frequency Table
Scores on an exam
82 58 64 80 75 72 87 73 88 94 84 78 93 69 70 60 53 84 76 87 84 61 89 95 87 91 75 99
Score      Frequency   Cumulative Frequency   Relative Frequency   Relative Cumulative Frequency
95 - 99    2           28                     .0714                1.0000
90 - 94    3           26                     .1071                .9286
85 - 89    5           23                     .1786                .8214
80 - 84    5           18                     .1786                .6429
75 - 79    4           13                     .1429                .4643
70 - 74    3           9                      .1071                .3214
65 - 69    1           6                      .0357                .2143
60 - 64    3           5                      .1071                .1786
55 - 59    1           2                      .0357                .0714
50 - 54    1           1                      .0357                .0357
Cumulative frequency: just adding up the frequency data from the smallest to largest numbers
Relative frequency: just dividing each frequency by total number to get a ratio (like a percent)
Please note: 1/28 = .0357, 3/28 = .1071, 4/28 = .1429
Relative cumulative frequency: just adding up the relative frequency data from the smallest to largest numbers
Please note: also just dividing cumulative frequency by total number: 1/28 = .0357, 2/28 = .0714, 5/28 = .1786
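
The three derived columns follow from the frequency column alone. A minimal sketch, listing classes from lowest to highest so that the running totals build in the direction the slide describes:

```python
from itertools import accumulate

# Frequencies for the 10 classes of width 5, lowest class (50 - 54) first.
labels = ["50 - 54", "55 - 59", "60 - 64", "65 - 69", "70 - 74",
          "75 - 79", "80 - 84", "85 - 89", "90 - 94", "95 - 99"]
freq = [1, 1, 3, 1, 3, 4, 5, 5, 3, 2]

n = sum(freq)                       # total number of scores
cum = list(accumulate(freq))        # running total from smallest to largest
rel = [f / n for f in freq]         # each frequency divided by the total
cum_rel = [c / n for c in cum]      # cumulative frequency divided by the total

for label, f, c, r, cr in zip(labels, freq, cum, rel, cum_rel):
    print(f"{label}  {f:2d}  {c:2d}  {r:.4f}  {cr:.4f}")
```

Dividing the cumulative counts by the total avoids the small drift that comes from adding up already-rounded relative frequencies.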

46 Cumulative Frequency Data
Scores on an exam
82 58 64 80 75 72 87 73 88 94 84 78 93 69 70 60 53 84 76 87 84 61 89 95 87 91 75 99
Score      Frequency   Cumulative Frequency   Relative Frequency   Cumulative Rel. Freq.
95 - 99    2           28                     .0714                1.0000
90 - 94    3           26                     .1071                .9286
85 - 89    5           23                     .1786                .8214
80 - 84    5           18                     .1786                .6429
75 - 79    4           13                     .1429                .4643
70 - 74    3           9                      .1071                .3214
65 - 69    1           6                      .0357                .2143
60 - 64    3           5                      .1071                .1786
55 - 59    1           2                      .0357                .0714
50 - 54    1           1                      .0357                .0357
Cumulative Frequency Histogram
Where are we?

47 Scores on an exam
82 58 64 80
75 72 87 73
88 94 84 78
93 69 70 60
53 84 76 87
84 61 89 95
87 91 75 99
Step 1: List scores
Step 2: List scores in order
Step 3: Decide grouped
Step 4: Decide 10 for # bins (classes), 5 for bin width (interval size)
Score       Frequency
95 - 99     2
90 - 94     3
85 - 89     5
80 - 84     5
75 - 79     4
70 - 74     3
65 - 69     1
60 - 64     3
55 - 59     1
50 - 54     1
Step 5: Generate frequency histogram
[Frequency histogram: horizontal axis “Score on exam” with classes 50 - 54 through 95 - 99; vertical axis frequency from 1 to 6]
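
As a stand-in for the plotted histogram, the frequency table can be drawn as a sideways text chart. This is only a sketch: the asterisk bars are an illustrative choice, not the lecture's chart.

```python
# Frequency table for 10 bins of width 5, lowest class first.
table = [("50 - 54", 1), ("55 - 59", 1), ("60 - 64", 3), ("65 - 69", 1),
         ("70 - 74", 3), ("75 - 79", 4), ("80 - 84", 5), ("85 - 89", 5),
         ("90 - 94", 3), ("95 - 99", 2)]

# One bar per class; bar length equals the class frequency.
lines = [f"{label} | {'*' * count}" for label, count in table]
print("\n".join(lines))
```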

48 Generate frequency polygon
Plot midpoint of histogram intervals
Connect the midpoints
Scores on an exam
82 58 64 80
75 72 87 73
88 94 84 78
93 69 70 60
53 84 76 87
84 61 89 95
87 91 75 99
Score       Frequency
95 - 99     2
90 - 94     3
85 - 89     5
80 - 84     5
75 - 79     4
70 - 74     3
65 - 69     1
60 - 64     3
55 - 59     1
50 - 54     1
[Frequency polygon: horizontal axis “Score on exam” with classes 50 - 54 through 95 - 99; vertical axis frequency from 1 to 6]
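
The points of a frequency polygon are just (class midpoint, frequency) pairs. A small sketch that computes them from the 10-bin table; drawing the connecting lines is left to whatever charting tool is in use.

```python
# (lower limit, upper limit, frequency) for each class, lowest class first.
table = [(50, 54, 1), (55, 59, 1), (60, 64, 3), (65, 69, 1), (70, 74, 3),
         (75, 79, 4), (80, 84, 5), (85, 89, 5), (90, 94, 3), (95, 99, 2)]

# Midpoint of each interval paired with its frequency.
points = [((low + high) / 2, count) for low, high, count in table]

for midpoint, count in points:
    print(midpoint, count)
```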

49 Generate frequency ogive (“oh-jive”)
Frequency ogive is used for cumulative data
Plot midpoint of histogram intervals
Connect the midpoints
Scores on an exam
82 58 64 80
75 72 87 73
88 94 84 78
93 69 70 60
53 84 76 87
84 61 89 95
87 91 75 99
Score       Cumulative Frequency
95 - 99     28
90 - 94     26
85 - 89     23
80 - 84     18
75 - 79     13
70 - 74     9
65 - 69     6
60 - 64     5
55 - 59     2
50 - 54     1
[Frequency ogive: horizontal axis “Score on exam” with classes 50 - 54 through 95 - 99; vertical axis cumulative frequency from 5 to 30]

50 Pareto Chart: Categories are displayed in descending order of frequency

51 Stacked Bar Chart: Bar Height is the sum of several subtotals

52 Simple Line Charts: often used for time series data (continuous data); the space between data points implies a continuous flow
Note: Can use a two-scale chart with caution
Note: Fewer grid lines can be more effective
Note: For multiple variables, lines can be better than a bar graph

53 Pie Charts: general idea of data that must sum to a total (these are problematic and overused; use with much caution)
Bar charts can often be more effective
Exploded 3-D pie charts look cool but a simple 2-D chart may be more clear

54 Simple Frequency Table - Qualitative Data
We asked 100 Republicans “Who is your favorite candidate?”
Data based on Gallup poll on 8/24/11
If 22 million Republicans voted today, how many would vote for each candidate?
Candidate           Frequency   Relative Frequency   Percent   Number expected to vote
Rick Perry          29          .2900                29%       6,380,000
Mitt Romney         17          .1700                17%       3,740,000
Ron Paul            13          .1300                13%       2,860,000
Michele Bachmann    10          .1000                10%       2,200,000
Herman Cain         4           .0400                4%        880,000
Newt Gingrich       4           .0400                4%        880,000
No preference       23          .2300                23%       5,060,000
Relative frequency: just divide each frequency by total number
Please note: 29/100 = .2900, 17/100 = .1700, 13/100 = .1300, 4/100 = .0400
Percent: just multiply each relative frequency by 100
Please note: .2900 x 100 = 29%, .1700 x 100 = 17%, .1300 x 100 = 13%, .0400 x 100 = 4%
Number expected to vote: just multiply each relative frequency by 22 million
Please note: .2900 x 22m = 6,380k, .1700 x 22m = 3,740k, .1300 x 22m = 2,860k, .0400 x 22m = 880k
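
The same three calculations, sketched in Python with the poll counts from this slide. The expected-vote figure is computed with whole-number arithmetic to avoid rounding noise; the variable names are illustrative.

```python
# Poll counts from the slide (100 respondents in total).
poll = {"Rick Perry": 29, "Mitt Romney": 17, "Ron Paul": 13,
        "Michele Bachmann": 10, "Herman Cain": 4, "Newt Gingrich": 4,
        "No preference": 23}

total = sum(poll.values())
voters = 22_000_000   # "If 22 million Republicans voted today"

rows = {}
for name, count in poll.items():
    relative = count / total             # frequency divided by total number
    percent = 100 * count / total        # relative frequency times 100
    expected = voters * count // total   # relative frequency times 22 million
    rows[name] = (relative, percent, expected)
    print(f"{name:18s} {count:3d}  {relative:.4f}  {percent:.0f}%  {expected:,}")
```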
