
1 Module 5: Basic Concepts of Measurement

2 Module 5 focuses on concepts and terminology that will be helpful as you administer and interpret tests and other standardized measures. Understanding these concepts and terms will help teachers and childcare workers clearly and accurately communicate information gathered from assessment to parents and other early childhood education stakeholders. By the conclusion of Module 5, you will understand basic test and measurement concepts as a means of interpreting test results. Reading for Module 5: Chapter 4, Using Basic Concepts of Measurement

3 Importance of Measurement Concepts Teachers administer, collect, organize, interpret, evaluate, and report assessment data. Familiarity with measurement terms and concepts facilitates teachers’ ability to understand how to interpret and evaluate test results. Understanding measurement concepts also helps teachers make use of data collected from non-test assessment methods (e.g., inventories, authentic assessments).

4 Measurement Terminology – Raw Scores The raw score is the number of items a child answers correctly on a test. Raw scores can be obtained on both teacher-made and standardized tests. Raw scores provide limited information—they do not allow for comparison between children’s performances. Raw scores must be converted into a form that allows for comparison.

5 Measurement Terminology - Mean The arithmetic average, or mean, is one way to convert raw scores into measures that allow for comparison among children’s performances. The mean is equal to the sum of all scores divided by the number of scores. Example: Set of raw scores: 60, 62, 65, 67, 72 Mean = (60 + 62 + 65 + 67 + 72) / 5 = 326 / 5 = 65.2 You can now interpret a child’s performance relative to the average performance of the group.
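A minimal Python sketch can be used to check the arithmetic; the score list below is the one from the example above.

```python
# Computing the mean of a set of raw scores (the example set above).
raw_scores = [60, 62, 65, 67, 72]

mean = sum(raw_scores) / len(raw_scores)
print(mean)  # 65.2
```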

6 Measurement Terminology - Range It is helpful to know the range of scores on a test. The range gives you an idea of the spread of scores. The range is the difference between the highest and lowest score and is calculated by subtracting the lowest score from the highest score. Example: Set of raw scores: 60, 62, 65, 67, 72 High score – low score = Range 72 – 60 = 12 The range also provides helpful context when interpreting other information about children’s test scores.

7 Measurement Terminology - Range What is the mean of the set of raw scores below? What is the range of the scores? Raw scores: 80, 72, 84, 95, 63, 62, 88, 74, 78, 64 Mean = ___ Range = ___ Would you prefer to teach in a situation where there was a wide range or a narrow range of scores?

8 Measurement Terminology - Range The mean for the raw scores on the previous slide is 76, and the range is 33. This range suggests there is considerable diversity in performance among children in this class. Planning for groups with a wide range in performance requires a diversity of materials and exercises in order to address the learning needs of all children. The standard deviation, however, provides still more information about a set of test scores.
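A short Python sketch confirms the mean and range reported for the ten raw scores on the previous slide.

```python
# Mean and range for the ten raw scores from the exercise slide.
scores = [80, 72, 84, 95, 63, 62, 88, 74, 78, 64]

mean = sum(scores) / len(scores)          # 76.0
score_range = max(scores) - min(scores)   # 95 - 62 = 33
print(mean, score_range)
```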

9 Measurement Terminology - Standard Deviation The standard deviation is a measure of how far scores are spread from the mean. The normal curve represents a hypothetical distribution of scores if the test were taken by every child of the same age or grade in the population for which the test was designed.

10 Measurement Terminology - Standard Deviation As you can see from the normal curve, most students, approximately 68%, score closest to the mean: 34% score just below the mean and 34% score just above it. These scores fall within 1 standard deviation of the mean. The fewest students, approximately 4%, score farthest from the mean: about 2% well above it and 2% well below it. These scores fall between 2 and 3 standard deviations from the mean. The normal curve presents the typical spread of scores among the children in your class. The standard deviation presents a clearer picture of how scores compare.
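The sketch below, using an illustrative score set, shows how the standard deviation is computed and how many of the scores fall within one standard deviation of the mean; in a normal distribution, roughly 68% of scores fall in that band.

```python
# Standard deviation and the 1-SD band for an illustrative set of scores.
import statistics

scores = [60, 62, 65, 67, 72]
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population standard deviation

within_one_sd = [s for s in scores if mean - sd <= s <= mean + sd]
print(f"mean = {mean}, sd = {sd:.2f}, within 1 SD: {len(within_one_sd)} of {len(scores)}")
```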

11 Standardized Tests Standardized tests are increasingly being administered in early childhood educational settings. Standardized tests: Are designed to reduce bias in the assessment of individual children; Allow for comparison among groups; Are based on knowledge and skills embedded in state and national standards.

12 Standardized Tests Standardized tests are administered, scored, and interpreted in a predetermined way. There are two types of standardized tests. Norm-referenced tests compare the performance of individual students to the performance of other children who take the same test. Criterion-referenced tests compare a child’s performance to his or her progress along identified skills or behaviors. Criteria are established, and teachers must assess or observe the child’s performance against specific criteria.

13 Standardized Tests Standardized test developers are guided by test development standards developed by the American Educational Research Association (AERA), American Psychological Association (APA), and the National Council on Measurement in Education (NCME). As a result, in the process of developing standardized tests, all test developers: Determine the rationale and purpose for the test; Explain what the test will measure; Determine who will be tested; Work toward absence of bias (ensure that the test is not offensive or unfair to certain groups of children); Explain how the test results will be used.

14 Standardized Tests When using standardized tests, you should make sure you: Match the test to the question(s) you want answered; Use the test for the purpose for which it was designed; Choose a test that is valid and reliable; Follow the directions for administering the test exactly as they are outlined in the test manual; Understand the report and statistics generated by the test.

15 Normative Samples Understanding the normative sample used by developers to standardize a test helps you determine whether your children were represented in the standardization process. Developers use a norming sample in the standardization process. Samples are taken from populations. A population is an entire group of individuals having at least one characteristic in common. Tests are standardized for a particular group of individuals (e.g., kindergarteners, preschoolers).

16 Normative Samples Since developers can’t give a test to all members of a population during the development phase, they administer the test to a representative sample drawn from that population.

17 Normative Samples Characteristics of the population must be represented in the same proportion in the sample as they are in the population. For example, a sample of kindergarteners might have representation as follows: Hispanic American 14%, Asian American 5%, African American 12%, White 69%.

18 Norming All standardized tests are subjected to norming, the process used to determine how most children from the population for which the test was designed will score. The norming process: select the normative sample; administer the test to all members of the sample; determine how children in the sample score on the test.

19 Norms Norms are the scores obtained from testing the normative sample. Norms are influenced by the representativeness of the normative sample and the number of individuals in the sample.

20 Test Scores Test scores provide a snapshot or a sample of children’s behavior on the day the test was given. Test scores allow teachers to evaluate the difference between the behavior reflected by the test score and the behavior expected given the child’s age and grade level. This information is very important as teachers plan appropriate learning activities for the child.

21 Derived Test Scores Derived test scores are performance scores obtained when raw test scores of the individual child are compared to scores generated from the norming sample. Derived test scores are located in test manuals for specific tests and may take the form of developmental scores, percentiles or standard scores.

22 Developmental Scores Age Equivalent Scores: Compare a child’s performance to that expected of the average child of a given age. A child who scores 4-3 performs at the level of the average child aged four years, three months. Grade Equivalent Scores: Compare a child’s performance to that expected for his or her grade level. A first grader who scores 1.7 on a reading test performed at the level expected of the average child in the seventh month of first grade.

23 Percentile Ranks Percentile ranks are derived scores that show where a given raw score falls relative to the scores of other children. Percentile ranks are based on the percentage of children who scored at or below a given raw score, not on the percentage of correct answers. A child at the 75th percentile scores as well as or better than 75% of the children in the norming sample.
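A hedged sketch of the idea, using a made-up norming sample: the percentile rank reports the percentage of children who scored at or below a given raw score.

```python
# Percentile rank of a raw score within a hypothetical norming sample.
norming_sample = [48, 50, 52, 55, 55, 58, 60, 61, 63, 65, 67, 70]

def percentile_rank(raw_score, sample):
    at_or_below = sum(1 for s in sample if s <= raw_score)
    return 100 * at_or_below / len(sample)

print(percentile_rank(63, norming_sample))  # 75.0 -> 75th percentile
```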

24 Standard Scores Standard scores are derived scores that have been transformed so that the mean and standard deviation have predetermined values. The deviation IQ is an example of a standard score: the mean has been established as 100 and the standard deviation is set at 15.
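The sketch below shows how a raw score might be rescaled to a deviation-IQ-style standard score; the norming-sample mean and standard deviation are hypothetical values of the kind a test manual would report.

```python
# Converting a raw score to a deviation IQ (mean 100, SD 15).
NORM_MEAN, NORM_SD = 52.0, 8.0  # hypothetical norming-sample statistics

def deviation_iq(raw_score):
    z = (raw_score - NORM_MEAN) / NORM_SD  # distance from the mean in SD units
    return 100 + 15 * z                    # rescale to mean 100, SD 15

print(deviation_iq(60))  # one SD above the mean -> 115.0
```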

25 Standard Scores Normal Curve Equivalents The normal curve has been divided into 100 equal intervals with a mean of 50. The standard deviation has been established as approximately 21.

26 Standard Scores Stanines (standard nines) Distributions are divided into nine parts. The middle, or fifth, stanine extends .25 standard deviations below and above the mean. The second, third, and fourth stanines are each .5 standard deviations wide and fall below the mean; the sixth, seventh, and eighth stanines are each .5 standard deviations wide and fall above the mean. The first stanine falls more than 1.75 standard deviations below the mean, and the ninth stanine falls more than 1.75 standard deviations above the mean. The first, second, and third stanines are generally considered below average; the fourth, fifth, and sixth stanines average; and the seventh, eighth, and ninth stanines above average. Stanines are the least precise of the standard scores discussed.
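A small sketch of the banding described above: a z-score (distance from the mean in standard-deviation units) is mapped to a stanine using band edges at .25, .75, 1.25, and 1.75 standard deviations on either side of the mean.

```python
# Mapping a z-score to a stanine (1-9) using the half-SD bands described above.
import bisect

BOUNDARIES = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

def stanine(z):
    return bisect.bisect_right(BOUNDARIES, z) + 1  # yields 1 through 9

print(stanine(0.0))   # 5 (middle stanine)
print(stanine(-2.0))  # 1 (more than 1.75 SD below the mean)
print(stanine(1.0))   # 7
```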

27 Standard Scores Stanines and deviation quotients are routinely used by early childhood educators. Parents may be able to understand stanines more easily than the other standard scores discussed, because performance is represented by a single digit.

28 Reliability and Validity Reliability refers to the consistency, dependability, and stability of a test. Validity refers to the extent to which a test measures what it is supposed to measure.

29 Reliability Determining Reliability Test-Retest Reliability: The same test is given to the same group after a period of time elapses. Students should have similar results the second time the test is taken. Alternate-Form Reliability: Two separate tests addressing the same material are given to the same group. Students should have similar scores on both tests. Split-Half Reliability: The same test is administered to the same group but split in half; test takers’ responses on the first half of the test are correlated with their responses on the second half. This process serves as a measure of internal consistency. Interrater Reliability: The extent to which two different testers obtain the same results when using the same test. Test results should not be influenced by who administers the test.
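As one illustration, split-half reliability can be estimated by correlating each child's score on the odd-numbered items with their score on the even-numbered items and then applying the Spearman-Brown correction; the item data below are hypothetical.

```python
# Split-half reliability with the Spearman-Brown correction (hypothetical data).
import statistics

# Each inner list is one child's item scores (1 = correct, 0 = incorrect).
item_matrix = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
]

odd_half  = [sum(child[0::2]) for child in item_matrix]   # items 1, 3, 5, 7
even_half = [sum(child[1::2]) for child in item_matrix]   # items 2, 4, 6, 8

r_half = statistics.correlation(odd_half, even_half)      # Pearson r (Python 3.10+)
r_full = (2 * r_half) / (1 + r_half)                      # Spearman-Brown correction
print(round(r_half, 2), round(r_full, 2))                 # 0.81 0.89
```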

30 Reliability and Correlation Coefficients Reliability is reported as a correlation coefficient in test manuals. A correlation describes the relationship between two or more variables, and the correlation coefficient is the number that tells us the degree of correlation between them. Two variables may have a strong positive relationship, a strong negative relationship, or no relationship at all. Most correlation coefficients reported in test manuals are positive. The higher the correlation coefficient, the greater the reliability.

31 Measures of Relationship Correlation Coefficient 1.00 = perfect positive correlation. This means that an increase in one variable is associated with an increase in the other variable (for example, music participation and memory). -1.00 = perfect negative correlation. This means that an increase in one variable is associated with a decrease in the other variable (for example, watching TV and creativity).

32 Measures of Relationship Correlation Coefficient 0.0 = no correlation, meaning that one variable is not associated with the other. Interpretation of Correlation Coefficients: .00–.20 Negligible to low correlation; .20–.40 Low correlation; .40–.60 Moderate correlation; .60–.80 High correlation; .80–1.00 Very high to perfect correlation.
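The sketch below computes a correlation coefficient for two hypothetical sets of paired scores and labels it using the interpretation bands above.

```python
# Computing and labeling a correlation coefficient (hypothetical paired scores).
import statistics

test_a = [60, 62, 65, 67, 72, 75]
test_b = [58, 64, 66, 70, 71, 78]

r = statistics.correlation(test_a, test_b)  # Pearson r (Python 3.10+)

if abs(r) >= 0.80:
    label = "very high to perfect"
elif abs(r) >= 0.60:
    label = "high"
elif abs(r) >= 0.40:
    label = "moderate"
elif abs(r) >= 0.20:
    label = "low"
else:
    label = "negligible to low"
print(f"r = {r:.2f} ({label})")
```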

33 Factors Affecting Reliability Standard Error of Measurement (SEM): an estimate of how much a score varies due to chance. The larger the SEM, the less reliable the test. The longer the test, the more reliable it tends to be. Tests with age ranges should have sufficient test items for each age level; if not, the test may be less reliable for some age groups within the age range of the test. The shorter the time interval between two administrations of the same test, the higher the reliability coefficient will be. The larger the norming sample, the more reliable the test. The wider the range of scores, the better the test distinguishes among test takers, which makes for a more reliable test.
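A brief sketch of the usual formula, SEM = SD × √(1 − reliability); the standard deviation and reliability coefficient below are hypothetical values of the kind reported in a test manual.

```python
# Standard error of measurement and a rough 1-SEM band around an observed score.
import math

sd = 15.0           # standard deviation of the test's standard scores (hypothetical)
reliability = 0.91  # reliability coefficient from the manual (hypothetical)

sem = sd * math.sqrt(1 - reliability)   # 15 * sqrt(0.09) = 4.5
observed = 104
print(f"SEM = {sem:.1f}; the true score likely falls between "
      f"{observed - sem:.1f} and {observed + sem:.1f} (about 68% confidence)")
```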

34 Validity Types of Validity Face Validity: A test looks as though it tests what it is supposed to test. Content Validity: A test covers the subject matter/content it is supposed to test. Criterion-Related Validity: A relationship exists between scores on a test and another criterion measure that is valid. Concurrent Validity: When two tests are taken at the same time, the validity of one test can be established by the other if the scores relate to each other and the validity of at least one of the tests has already been established. Predictive Validity: The extent to which a test score can estimate performance on a future test or criterion. Construct Validity: The extent to which a test measures a theoretical construct, such as intelligence. Convergent Validity: Similar tests measuring similar constructs yield comparable results. Discriminant Validity: The extent to which a measure does not correlate with other constructs from which it is supposed to differ. Social Validity: Refers to the usefulness of assessment information for teachers in educational settings.

35 Factors Affecting Validity A valid test must be a reliable test. Tests may not be valid for children who are distractible, who fail to understand test instructions, or who are uncooperative. Issues affecting the child being tested (e.g., anxiety, motivation, degree of bilingualism) may also affect the validity of the test.

36 Evaluating Tests Tests should be evaluated for technical adequacy (see Boxes 4.4, 4.5, and 4.6, pp. 98-99) and appropriateness. Sources to check: NAEYC position on Early Childhood Curriculum, Assessment, and Program Evaluation; NAEYC Code of Ethical Conduct and Statement of Commitment; ICLD Clinical Practices Guide; Mental Measurements Yearbook; Appendix C for information on specific standardized and diagnostic tests.

37 What Next? Review Section V of the Early Childhood Assessment Study Guide. Can you explain each of the concepts and terms listed? Reflect on the value of standardized tests. What is the primary value of standardized tests in a comprehensive assessment system? Connect with a parent of an early learner. What has been their experience with standardized tests? Did they find the test information useful? Connect with an early childhood educator. How have they used standardized test information for the children they teach?

