AS 91581 Achievement Standard.

Presentation on theme: "AS 91581 Achievement Standard."— Presentation transcript:

1 AS 91581 Achievement Standard

2 PPDAC cycle

3 AS 3.9: Achieved (A), Merit (M), Excellence (E)
A: Investigate bivariate measurement data
M: Investigate bivariate measurement data, with justification
E: Investigate bivariate measurement data, with statistical insight
One of the principles: Grade distinctions should not be based on the candidate being required to acquire and retain more subject-specific knowledge.

4 Assessment is based on the PPDAC
Pose a question, Plan, Data, Analyse, Conclude, then back to the question.

5 Statistical enquiry cycle (PPDAC)
The Statistical Enquiry Cycle (PPDAC)

6 PPDAC is vital

7 Using the statistical enquiry cycle to …
investigate bivariate measurement data involves: posing an appropriate relationship question using a given multivariate data set; selecting and using appropriate displays; identifying features in data; finding an appropriate model; describing the nature and strength of the relationship and relating this to the context; using the model to make a prediction; communicating findings in a conclusion. (From Explanatory Note 3)

8 Using the statistical enquiry cycle to …
investigate bivariate measurement data involves: posing an appropriate relationship question using a given multivariate data set; selecting and using appropriate displays; identifying features in data; finding an appropriate model; describing the nature and strength of the relationship and relating this to the context; using the model to make a prediction; communicating findings in a conclusion.

9 Posing relationship questions
Possibly the most important component of the investigation. Time spent on this component can determine the overall quality of the investigation. This component provides an opportunity to show justification (M) and statistical insight (E).

10 Posing relationship questions
What makes a good relationship question? It is written as a question. It is written as a relationship question. It can be answered with the data available. The variables of interest are specified. It is a question whose answer is useful or interesting. The question is related to the purpose of the task. Think about the population of interest. Can the results be extended to a wider population?

11 Data

12 We need to understand the variables before we can make any progress.
WHOA!

13

14 Consider the variables (using context)

15 Think about which variables could be related. JUSTIFY YOUR REASONING.

16 Developing question posing skills
Pose several relationship questions (written with reasons/justifications). Possibly critique the questions. The precise meaning of some variables may need to be researched.

17 Different relationship questions
Is there a relationship between variable 1 and variable 2 for Hector’s dolphins? What is the nature of the relationship between variable 1 and variable 2 for Hector’s dolphins? Can variable 1 be used to predict variable 2 for Hector’s dolphins?

18 Research about the situation helps develop ideas
“We might like to ask ourselves why we would want to know about any relationships that might exist.”

19 Reference to research

20 HINT: Groupings

21 A morphological study of skull and mandible features was undertaken to examine variation between the most genetically distinct population, occurring on the west coast of the North Island, and the populations around the South Island. Univariate and principal component analyses demonstrate that the North Island population can be differentiated from the southern populations on the basis of several skeletal characters.

22 Developing question posing skills
It is expected that you do some research. This improves your knowledge of the variables and the context. You may find some related studies that create potential for integrating statistical and contextual knowledge.

23 Developing question posing skills
Draw some scatter plots to start to investigate your questions. Reduce, add to and/or prioritise your list of questions. Possibly critique the questions again.

24 Consider the variables (using context)

25 Appropriate displays
Which variable goes on the x-axis and which goes on the y-axis? It depends on the question and on the variables of interest. Is there a relationship between zygomatic width and rostrum length for Hector’s dolphins?

26 Variables on axes
Is there a relationship between zygomatic width and rostrum length for Hector’s dolphins?

27 This is a straightforward relationship question.
Is there a relationship between zygomatic width and rostrum length for Hector’s dolphins?

28 This is a ‘correlation’ type question.
Is there a relationship between zygomatic width and rostrum length for Hector’s dolphins?

29 There is no difference in the roles of the two variables, so for this question it does not matter which variable goes on the x-axis and which goes on the y-axis.
Is there a relationship between zygomatic width and rostrum length for Hector’s dolphins?

30 Which variable goes on the x-axis and which goes on the y-axis?
It depends on the question and on the variables of interest. Is there a relationship between zygomatic width and rostrum length for Hector’s dolphins? Is there a relationship between rostrum width at midlength and rostrum width at base for Hector’s dolphins?

31 Variables on axes
Is there a relationship between rostrum width at midlength and rostrum width at the base for Hector’s dolphins?

32 “I think that the width at the base could help determine (or influence) the width at midlength.”
Is there a relationship between rostrum width at midlength and rostrum width at the base for Hector’s dolphins?

33 Rostrum width at base (RWB) is the explanatory variable and rostrum width at midlength (RWM) is the response variable.
Is there a relationship between rostrum width at midlength and rostrum width at the base for Hector’s dolphins?

34 Which variable goes on the x-axis and which goes on the y-axis?
It depends on the question and on the variables of interest. Is there a relationship between zygomatic width and rostrum length for Hector’s dolphins? Is there a relationship between rostrum width at midlength and rostrum width at the base for Hector’s dolphins? For Hector’s dolphins, can rostrum length be used to predict mandible length?

35 Variables on axes
For Hector’s dolphins, can rostrum length be used to predict mandible length?

36 This is a regression question
For Hector’s dolphins, can rostrum length be used to predict mandible length?

37 It is clear that we are interested in how mandible length responds to rostrum length, so rostrum length is the explanatory variable and mandible length is the response variable.

38 Experiments: If the data comes from an experiment then some variables would be classed as input variables and others as output variables. It would make sense to see how an input variable affects an output variable. The input variable would be the explanatory variable and the output variable would be the response variable.

39 Using the statistical enquiry cycle to …
investigate bivariate measurement data involves: posing an appropriate relationship question using a given multivariate data set; selecting and using appropriate displays; identifying features in data; finding an appropriate model; describing the nature and strength of the relationship and relating this to the context; using the model to make a prediction; communicating findings in a conclusion.

40 Same approach whether linear or non-linear
investigate bivariate measurement data involves: posing an appropriate relationship question using a given multivariate data set; selecting and using appropriate displays; identifying features in data; finding an appropriate model; describing the nature and strength of the relationship and relating this to the context; using the model to make a prediction; communicating findings in a conclusion.

41 Features, model, nature and strength
Generate the scatter plot of RWM versus RWB using iNZight (or Excel).

42 Features, model, nature and strength
Generate the scatter plot. Let the data speak. Use your eyes (visual aspects). Write about what you see, not what you don’t see. DON’T fit a model yet.

43 Features, model, nature and strength
Template for features (but allow flexibility): trend; association (nature); strength (degree of scatter); groupings/clusters; unusual observations; other (e.g., variation in scatter). If there are groupings evident in the plot then it may be better to explore the groups separately at this stage.

44 Trend
From the scatter plot it appears that there is a linear trend between rostrum width at base and rostrum width at midlength. This is a reasonable expectation because two different measures on the same body part of an animal could be in proportion to each other.
The key point is whether the trend is linear or non-linear. Use descriptions of variables rather than variable names; this keeps it contextual. Refer to the graph. The last sentence of the example illustrates some reflection.

45 Association
The scatter plot also shows that as the rostrum width at base increases the rostrum width at midlength tends to increase. This is to be expected because dolphins with small rostrums would tend to have small values for rostrum widths at base and midlength and dolphins with large rostrums would tend to have large values for rostrum widths at base and midlength.
A contextual description is preferable to one using technical terms. However, it is appropriate to use terms such as positive, negative or no association, but they are better used after the contextual description. “Tends to” is a good term to use.
Higher level considerations: You should reflect on the nature of the relationship with respect to the context. At this stage you could acknowledge (if the data does not come from a randomised experiment) that you have found only a statistical relationship and that this does not necessarily imply a causal relationship between the variables. Alternatively, if the data comes from a suitable experiment you could make a causation claim in your conclusion. You may acknowledge that other variables (which you must name) would impact on the (response) variable, and suggest how they might impact on the variable. For example, gender, age, etc.

46 Association
The scatter plot also shows that as the rostrum width at base increases the rostrum width at midlength tends to increase. A contextual description is preferable to one using technical terms. However, it is appropriate to use terms such as positive, negative or no association, but they are better used after the contextual description.

47 Higher level considerations
This is to be expected because dolphins with small rostrums would tend to have small values for rostrum widths at base and midlength and dolphins with large rostrums would tend to have large values for rostrum widths at base and midlength. You should reflect on the nature of the relationship with respect to the context. At this stage you could acknowledge (if the data does not come from a randomised experiment) that you have found only a statistical relationship and that this does not necessarily imply a causal relationship between the variables.

48 Higher level considerations
Alternatively, if the data comes from a suitable experiment you could make a causation claim in your conclusion. You may acknowledge that other variables (which you must name) would impact on the (response) variable, and suggest how they might impact on the variable. For example, gender, age, etc. You could perhaps show these and compare.

49 Find a model
Because the trend is linear I will fit a linear model to the data. You must state why you have selected this particular model. The line is a good model for the data because for all values of rostrum width at base, the number of points above the line is about the same as the number below it.

50 Find a model
Because the trend is linear I will fit a linear model to the data. You must state why you have selected this particular model. Don’t show the equation yet. There are still features in the data to comment on.

51 Find a model
Because the trend is linear I will fit a linear model to the data. You must state why you have selected this particular model. A discussion of fit throughout the range of x-values is required. If the number of observations is small, that casts some doubt on the reliability of the model.

52 Find a model
The line is a good model for the data because for all values of rostrum width at base, the number of points above the line is about the same as the number below it. A discussion of fit throughout the range of x-values is required. If the number of observations is small, that casts some doubt on the reliability of the model.
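The check described above (roughly equal numbers of points above and below the line throughout the range of x-values) can be sketched in code. This is only an illustration under stated assumptions: the measurements are invented, not the dolphin data, and the helper names `fit_line` and `above_below_by_band` are hypothetical. iNZight or Excel would normally do the fitting.

```python
# Hypothetical measurements in mm, invented for illustration (not the dolphin data).
rwb = [78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89]
rwm = [52, 54, 53, 55, 56, 55, 57, 58, 57, 59, 60, 59]


def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def above_below_by_band(xs, ys, slope, intercept, bands=3):
    """Count points above and below the line in equal-width bands of x."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / bands
    counts = [[0, 0] for _ in range(bands)]  # [above, below] for each band
    for x, y in zip(xs, ys):
        band = min(int((x - lo) / width), bands - 1)
        residual = y - (slope * x + intercept)
        if residual > 0:
            counts[band][0] += 1
        elif residual < 0:
            counts[band][1] += 1
    return counts


slope, intercept = fit_line(rwb, rwm)
bands = above_below_by_band(rwb, rwm, slope, intercept)
for i, (above, below) in enumerate(bands, start=1):
    print(f"band {i}: {above} above the line, {below} below")
```

If one band has nearly all of its points on one side of the line, that suggests the linear model does not fit well in that part of the range. With few points per band the counts are only a rough guide.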

53 Strength
The points on the graph are reasonably close to the fitted line so the relationship between rostrum width at midlength and rostrum width at base is reasonably strong. This is supported by the correlation coefficient r =
This must refer to visual aspects of the display. Appropriate descriptors are: strong, moderate or weak.

54 Strength
The points on the graph are reasonably close to the fitted line so the relationship between rostrum width at midlength and rostrum width at base is reasonably strong. This is supported by the correlation coefficient r =
You must refer to the degree of scatter about the trend or, equivalently, the closeness of the points to the trend.
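For readers who want to see where a correlation coefficient comes from, here is a minimal sketch of the Pearson calculation. The data are invented for illustration and are not the dolphin measurements, and `pearson_r` is a hypothetical helper name. The visual description of scatter should still come first, with r used only as support.

```python
from math import sqrt

# Hypothetical measurements in mm, invented for illustration (not the dolphin data).
rwb = [78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89]
rwm = [52, 54, 53, 55, 56, 55, 57, 58, 57, 59, 60, 59]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(rwb, rwm)
print(f"r = {r:.2f}")
```

A value close to 1 or -1 corresponds to points lying close to a straight line. r only measures linear association, so it should not be quoted for a clearly non-linear trend.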

55 Groupings
None are apparent from the scatter plot.

56 If groupings are apparent then can a reason be found that explains the groupings?
None are apparent from the scatter plot.

57 But the data set has an obvious grouping variable (Island).
None are apparent from the scatter plot.

58 iNZight

59 Groupings

60 Groupings
Now it appears that NI Hector’s dolphins tend to have larger rostrum widths at base than SI ones. This is now helpful for doing predictions. There are some common values of RWB for both groups of dolphins. This allows more comments to be made about the model fitted to all data points.

61 Groupings
Now it appears that NI Hector’s dolphins tend to have larger rostrum widths at base than SI ones. Many NI points are above the line. Why is that? It could be the effect on the fitted line of the unusual point (RWB = 86, RWM = 50). Use iNZight to look at NI and SI dolphins separately (Subset by Island).
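Subsetting by a grouping variable and fitting a separate line to each group can be sketched as follows. This is an illustration only: the records are invented, not the dolphin data, and `fit_line` and `fit_by_group` are hypothetical helper names. In practice iNZight's subset feature does this.

```python
from collections import defaultdict

# Hypothetical records (island, RWB in mm, RWM in mm), invented for illustration only.
records = [
    ("NI", 84, 60), ("NI", 85, 60), ("NI", 86, 61), ("NI", 87, 61), ("NI", 88, 62),
    ("SI", 78, 50), ("SI", 80, 51), ("SI", 82, 52), ("SI", 84, 53), ("SI", 86, 54),
]


def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_group(rows):
    """Split rows by their group label and fit one line per group."""
    groups = defaultdict(lambda: ([], []))
    for label, x, y in rows:
        groups[label][0].append(x)
        groups[label][1].append(y)
    return {label: fit_line(xs, ys) for label, (xs, ys) in groups.items()}


models = fit_by_group(records)
for label, (slope, intercept) in sorted(models.items()):
    print(f"{label}: RWM = {slope:.2f} * RWB + {intercept:.2f} (n = "
          f"{sum(1 for r in records if r[0] == label)})")
```

Printing the number of points alongside each model is a reminder that a line fitted to a small group is less reliable.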

62 Groupings

63 This now provides an opportunity to comment on these models.

64 South Island dolphins: Model seems appropriate.

65 North Island: Only 13 data points
– model may not be useful for reliable conclusions.

66 Points with lower RWB values tend to be below the line, points in the middle tend to be above.
Could try a quadratic model, but the small number of data points is an issue.

67 Unusual points
One dolphin, one of those with a rostrum width at base of 86mm, had a smaller rostrum width at midlength compared to dolphins with the same, or similar, rostrum widths at base. Refer to actual data points. If not discussed earlier, comment on the effect any unusual values would have on the model.

68 Unusual points
One dolphin, one of those with a rostrum width at base of 86mm, had a smaller rostrum width at midlength compared to dolphins with the same, or similar, rostrum widths at base. Refer to actual data points.

69 Unusual points
One dolphin, one of those with a rostrum width at base of 86mm, had a smaller rostrum width at midlength compared to dolphins with the same, or similar, rostrum widths at base. Comment on the effect any unusual values would have on the model.
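One way to comment on the effect of an unusual value is to fit the model with and without it and compare. The sketch below assumes invented data, with only the unusual point (RWB = 86, RWM = 50) taken from the slides. `fit_line` is a hypothetical helper name.

```python
# Hypothetical measurements in mm; only the unusual point (86, 50) comes from the slides.
rwb = [80, 81, 82, 83, 84, 85, 86, 86, 87]
rwm = [52, 53, 53, 54, 55, 55, 56, 50, 57]
unusual = (86, 50)


def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


points = list(zip(rwb, rwm))
kept = [p for p in points if p != unusual]  # drop the unusual observation

slope_all, intercept_all = fit_line(*zip(*points))
slope_kept, intercept_kept = fit_line(*zip(*kept))

print(f"with unusual point:    RWM = {slope_all:.2f} * RWB + {intercept_all:.2f}")
print(f"without unusual point: RWM = {slope_kept:.2f} * RWB + {intercept_kept:.2f}")
```

Comparing the two lines shows how far a single point can pull the fit. Removing a point for comparison is not the same as deciding it is an error, and any such decision needs a contextual reason.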

70 Anything else Variation in scatter?

71 Anything else Variation in scatter?
Constant versus non-constant scatter relates to an assumption of linear regression, i.e. that the residuals have constant variance.

72 Anything else Variation in scatter?
Variation in scatter is relevant when discussing precision of predictions. Variation in scatter is more cognitively demanding because it involves “distribution reasoning”. Don’t be distracted by scarcity of data. This must refer to visual aspects of the display. Students should try to keep this contextual. For example: As the values of x increase, the amount (or degree) of variation in y tends to increase (for fanning out).

73 Anything else Variation in scatter?
Refer to visual aspects of the display and keep it contextual. For example: As the values of x increase, the amount (or degree) of variation in y tends to increase (for fanning out). Don’t be distracted by scarcity of data.
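A visual impression of fanning out can be backed up with a rough numerical check that compares the spread of the residuals for low and high x-values. The sketch below uses invented data that fans out by construction, and `fit_line` is a hypothetical helper name. It is a crude comparison, not a formal test.

```python
from statistics import pstdev

# Hypothetical data invented for illustration: scatter grows as x increases.
x = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
y = [20.0, 24.5, 27.5, 32.5, 35.0, 42.0, 41.0, 51.0, 46.0, 61.0]


def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(x, y)

# Order the points by x, then compare residual spread in the lower and upper halves.
pairs = sorted(zip(x, y))
residuals = [yi - (slope * xi + intercept) for xi, yi in pairs]
half = len(residuals) // 2
spread_low = pstdev(residuals[:half])
spread_high = pstdev(residuals[half:])

print(f"residual spread for low x:  {spread_low:.2f}")
print(f"residual spread for high x: {spread_high:.2f}")
```

A much larger spread at one end of the range suggests predictions there will be less precise. With only a handful of points in each half, an apparent difference may simply reflect scarcity of data.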

74 Prediction

75 Try to use relevant values of x
In this case any RWB value from 81mm to 86mm is sensible.

76 Prediction
All points, linear trend: RWM = 0.77 * RWB - 8.72
Summary for Island = 1 (NI), linear trend: RWM = 0.48 * RWB + 19.19
Summary for Island = 2 (SI), linear trend: RWM = 0.46 * RWB + 14.37

77 Use all three models with RWB = 85mm (all, NI only, SI only)
Using RWB = 85mm:
All points: RWM = 0.77 x 85 – 8.72 = 56.73
NI dolphins: RWM = 0.48 x 85 + 19.19 = 59.99
SI dolphins: RWM = 0.46 x 85 + 14.37 = 53.47
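The three predictions can be reproduced with a few lines of code. In this sketch the slopes and the all-points intercept are the values quoted in the slides. The NI and SI intercepts (19.19 and 14.37) are assumptions, back-calculated from the quoted predictions of 59.99 and 53.47 at RWB = 85mm. `predict_rwm` is a hypothetical helper name.

```python
# (slope, intercept) for each model. The NI and SI intercepts are assumptions,
# back-calculated from the predictions quoted for RWB = 85 mm.
models = {
    "All points": (0.77, -8.72),
    "NI dolphins": (0.48, 19.19),
    "SI dolphins": (0.46, 14.37),
}


def predict_rwm(slope, intercept, rwb_mm):
    """Predicted rostrum width at midlength (mm) for a given width at base (mm)."""
    return slope * rwb_mm + intercept


rwb_mm = 85
for name, (slope, intercept) in models.items():
    prediction = predict_rwm(slope, intercept, rwb_mm)
    # Round to the nearest millimetre when reporting in context.
    print(f"{name}: a rostrum width at base of {rwb_mm} mm gives a predicted "
          f"rostrum width at midlength of about {prediction:.0f} mm")
```

Reporting to the nearest millimetre reflects the precision of the measurements rather than the two decimal places of the arithmetic.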

78 Interpret in context, with units.

79 Use sensible rounding. Using RWB = 85mm
All points: RWM = 0.77 x 85 – 8.72 = 56.73
NI dolphins: RWM = 0.48 x 85 + 19.19 = 59.99
SI dolphins: RWM = 0.46 x 85 + 14.37 = 53.47

80 Dangers – predicting outside the range of observed x-values.
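A simple guard against extrapolation is to refuse to predict when the x-value lies outside a stated range. The sketch below uses the all-points linear trend quoted in the slides (RWM = 0.77 * RWB - 8.72). Treating 81mm to 86mm as the allowed range is an assumption, taken from the slides' remark about sensible RWB values for this example. `predict_within_range` is a hypothetical helper name.

```python
def predict_within_range(slope, intercept, x, x_min, x_max):
    """Predict y from the linear model, but only for x inside [x_min, x_max]."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the range {x_min} to {x_max}; "
            "the model has not been checked there"
        )
    return slope * x + intercept


# All-points model from the slides; the 81 to 86 mm range is an assumption.
SLOPE, INTERCEPT = 0.77, -8.72
X_MIN, X_MAX = 81, 86

for rwb_mm in (85, 120):
    try:
        rwm_mm = predict_within_range(SLOPE, INTERCEPT, rwb_mm, X_MIN, X_MAX)
        print(f"RWB = {rwb_mm} mm: predicted RWM is about {rwm_mm:.0f} mm")
    except ValueError as error:
        print(f"RWB = {rwb_mm} mm: no prediction made ({error})")
```

The point of the guard is that the fitted line describes the relationship only where data were observed. Outside that range the trend may change shape.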

81 How good is a prediction?
A prediction is an estimate so I need to consider bias and precision.

82 How good is a prediction?
Bias: If a model is good, then a prediction is likely to be accurate.

83 How good is a prediction?
The number of data points used to form the model is relevant.

84 How good is a prediction?
Precision: relates to the degree of scatter. Don’t relate a prediction to observed y-values.

85 Statistical enquiry cycle (PPDAC)

86 Communicating findings in a conclusion
Each component of the cycle must be communicated. The question(s) must be answered.

87 Summary
Basic principles: each component; context; visual aspects.
Higher level considerations: justify; extend; reflect.

88 Other issues (if time)
The use of articles or reports to assist contextual understanding. How to develop understanding of the effect of outliers on a model. The place of residuals and residual plots. Is there a place for transforming variables?

