Economic Perspectives on Standardized Testing(c)

1 Economic Perspectives on Standardized Testing(c)
Richard P. Phelps (c) 2002, by Richard P. Phelps

2 Economic Perspectives on Standardized Testing: Outline
1) Why can’t economists and psychologists just get along?
2) Overview of economic theory as it pertains to education & testing
3) Human capital theory and the economics of information
4) Supply & demand; benefits & costs; goods & bads
5) The cost of standardized testing (from society’s point of view)
6) The benefits of standardized testing (information)
7) The benefits of standardized testing (motivation)
8) Optimal testing system structures
9) Optimal testing industry structures
10) Discussion

3 Topic 1: Why can’t economists and psychologists just get along?

4 1) Why can’t economists and psychologists just get along? [answer: sometimes they do]
- Tversky and Kahneman, two cognitive psychologists, asked themselves why rational economic man patronizes casinos, where the odds are against him.
- Their experiments revealed that tolerance of (or attraction to) risk varies widely among individuals, and that most people weigh small risks against low-probability but very large gains “sub-optimally.”
- Tversky and Kahneman’s work is now required reading for any economics major.
- Experimental economics, which strongly resembles cognitive psychology in its methods, is now the fastest growing area of research in the field.

5 1) Why can’t economists and psychologists just get along? [answer: sometimes they do not]
Test Utility research
- Thousands of studies conducted by I/O psychologists from the 1960s through the 1980s
- Dozens of meta-analyses
- Even a few meta-analyses of the meta-analyses
- Few economists, then or now, even aware of the field
Decline in interest in Test Utility research
- Regulatory ruling against validity generalization in late 1980s by Civil Rights office in Reagan administration
- National Research Council forms committee with curious membership to critique a single Test Utility study (critique interpreted by many as a condemnation of all Test Utility research)

6 Topic 2: Overview of economic theory as it pertains to education & testing

7 2) Economic theory as it pertains to education in general
Traditionally, education economics conducted in 2 fields
Labor Economics
- Labor markets for teachers and graduates
- Returns (in wages) to investment (in years) in education
Public Finance
- Returns (in achievement, attainment) to investment (in tax revenues)
- Funding equity, adequacy, efficiency, & intra-metropolitan migration

8 2) Economic theory as it pertains to testing in particular
Human Capital Theory
- Higher wages over the long term can more than compensate for the earnings foregone while still in school
- …assumed a strong correlation between accumulation (years in school, any school) and earning power (applicable knowledge and skills)
Economics of Information
- Basic economic assumption of “perfect information” is simplistic
- When buyer and seller have “asymmetric” information, classic economic assumptions are not appropriate
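
The human capital claim above can be illustrated with a small present-value calculation. This is only a sketch: the wage levels, discount rate, and career length are hypothetical numbers chosen for illustration, not figures from the presentation.

```python
def present_value(cash_flows, rate):
    """Discount a list of yearly cash flows (year 0 first) back to today."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def net_gain_from_schooling(years_in_school, wage_without, wage_with,
                            career_years, rate):
    """Net present value of extra schooling versus working right away."""
    # While in school, the student forgoes the wage they could have earned.
    foregone = [-wage_without] * years_in_school
    # After school, the graduate earns a premium over the non-graduate wage.
    premium = [wage_with - wage_without] * (career_years - years_in_school)
    return present_value(foregone + premium, rate)


# Hypothetical inputs: 4 years of school, $30k vs $45k wages, 40-year horizon.
gain = net_gain_from_schooling(4, 30_000, 45_000, 40, 0.05)
print(f"Net present value of 4 extra years of school: ${gain:,.0f}")
```

With no wage premium the result is negative, since only the foregone earnings remain; the theory's prediction depends on the premium being large and long-lived enough to outweigh them.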

9 Topic 3: Human capital theory and the economics of information

10 3) Human capital theory: seminal works
- Human Capital (1964), Gary Becker
- Schooling, Experience, and Earnings (1974), Jacob Mincer
- Dozens of World Bank reports

11 3) Economics of Information: seminal works
“The Market for Lemons” (1970), George Akerlof
- When buyers can evaluate a purchase based only on a quality assessment of the entire group, sellers have an incentive to market poor quality merchandise and, over time, the average quality of goods declines.
- Often-used counters to quality decline are: guarantees, brand names, franchising, and credentials.
“Economics of Imperfect Information” (1976), Rothschild, Stiglitz, Grossman
- Perfectly competitive markets have perfect information.
- In markets without perfect information, there is little incentive for private individuals to fill the breach (Consumer Reports is an exception, and not very profitable).
- Thus, there can be a role for government to promote market efficiency, by providing information.
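
The quality-decline mechanism in the “lemons” argument can be sketched as a toy simulation. This is a deliberately simplified illustration, not Akerlof's formal model: it assumes buyers pay exactly the average quality of what is on offer, and sellers withdraw any good worth more than that price. The quality values are hypothetical.

```python
def lemons_rounds(qualities, rounds):
    """Return the average quality on offer at the start of each round."""
    on_market = list(qualities)
    averages = []
    for _ in range(rounds):
        if not on_market:
            break
        # Buyers cannot tell goods apart, so they pay for average quality.
        price = sum(on_market) / len(on_market)
        averages.append(price)
        # Sellers whose goods are worth more than the price withdraw them.
        on_market = [q for q in on_market if q <= price]
    return averages


# Ten hypothetical goods with quality scores 10, 20, ..., 100.
history = lemons_rounds(range(10, 101, 10), rounds=5)
print(history)  # [55.0, 30.0, 20.0, 15.0, 10.0]
```

Each round the better goods leave, so the average on offer falls. Guarantees, brand names, and credentials work by letting buyers price individual quality rather than the group average.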

12 3) Screening, signaling, filtering, credentialing, I
Education and Jobs: The Great Training Robbery (1970), Ivar Berg
- Employers pay for credentials, not human capital; they know little to nothing of the quality of education programs, only the perception thereof
Generating Inequality (1975), Lester Thurow
- Employers want “trainable” employees, and judge that those who could endure schooling are probably more trainable than those who could not
Work of Piore and Doeringer on “Market Segmentation”
- Neither education nor education credentials matter in “secondary” labor markets, only in the “primary” market, with career ladders

13 3) Screening, signaling, filtering, credentialing, II
Market Signaling (1973), Michael Spence
- Diplomas are a signaling device to employers, who take a gamble with every new hire; they are evidence that certain human capital has been obtained, which the graduate hopes employers will accept, but not proof that it has
“On the Weak versus the Strong Version of the Screening Hypothesis” (1979), George Psacharopoulos
- Weak: employers pay only higher starting wages for “better” credentials
- Strong: employers continue to pay higher wages for “better” credentials even after they become familiar with each employee’s actual productivity
“Higher Education as a Filter” (1973), Kenneth Arrow
“The Theory of Screening” (1975), Joseph Stiglitz

14 3) Empirical and theoretical work on standards
Burton Weisbrod (1964)
- Discovered that 90% of adults are hired within the boundaries of a school district other than the one from which they graduated
- So, employers are not familiar with and have no influence over the education standards used to train virtually all their employees
John Bishop (1980s)
- It is unreasonable to expect a teacher to be both a sympathetic coach and a neutral judge. External exams let them be coaches exclusively, which is in keeping with what most of them probably want anyway.
Robert Costrell (1994)
- School district incentives are to inflate grades and socially promote. If they maintain tough standards, they only hurt their own children in later competition against graduates of other districts where standards are lax and grades inflated.
- Standards must be enforced externally, or they will not be.

15 Topic 4: Supply & demand; benefits & costs; goods & bads

16 4) Benefits & costs; goods & bads
Economists are (small d) democrats
- What is a “good” or a benefit is relative to each individual; the researcher does not get to decide what is good or bad for the consumer; consumers decide for themselves
- But we’d all like more money (freely exchangeable) and more free time
- Economists assume we all want more of something (even if it is spiritual enlightenment), and that we can’t always get it
Benefits have two phases: creation and capture
- Not all potential benefits are realized, or “captured”
- E.g., you do very well and learn very much at a college with a terrible reputation, and then cannot get a job because of that reputation

17 4) The demand for standardized testing
Phelps (1998) - 40 years of public opinion poll data
- The adult public is not ignorant about standardized tests, since all have taken many, for better or for worse
- Support for high-stakes standardized testing is overwhelming, and has been consistently so for decades
- Most stakeholders, including students and parents, are strongly supportive. Teachers are usually supportive, but don’t like being judged for outcomes over which they have little control. Education professors are strongly opposed. Administrators have been on the fence, and may now be opposed.
- The year 2000 “testing backlash” was a very strongly hyped public relations creature, and completely unsupported by the objective evidence.

18 4) “Natural Experiments” in test demand and valuation:
a) countries liberalize education, b) drop test requirements, c) find that standards deteriorate, d) then revert to testing
- Many Western European and North American states (1960s – 1970s)
- Many Post-Colonial, Newly-Independent states (1940s – 1970s)
- Ex-Communist Eastern European states (1990s – 2000s)

19 4) Trends in test adding/dropping, OECD countries: 1974--1999

20 4) Countries adding or dropping large-scale, external testing, by type of testing: 1974-1999
Number of countries or provinces adding testing and dropping testing, by type of testing (values listed in that order):
- Assessments: 17, 0
- Upper secondary exit exams: 12*
- University entrance exams: 5
- Subject-area end-of-course exams: 6
- Lower secondary exit or entrance exams: 4, 2
- Inclusion of voc/prof tracks in exit exam system: 3
- Primary/secondary-level achievement testing: 1
- Diagnostic testing:
- TOTAL: 51

21 4) Countries with nationally standardized high-stakes exit exams, by level of education
Levels shown: Primary school; Lower secondary school; Upper secondary school
Belgium (French); Italy; Netherlands; Russia; Singapore; Switzerland (some cantons)
Canada: Quebec; China; Czech Republic; Denmark; France; Hungary; Iceland; Ireland; Japan; Korea; New Zealand; Norway; Portugal; Sweden; Switzerland; United Kingdom: England & Wales, Scotland
Belgium: (Flemish) & (French); Canada: Alberta, British Columbia, Manitoba, New Brunswick, Newfoundland, Quebec; Finland; Germany

22 4) Demand for testing is not unlimited – saturation is possible
School district response to state test mandates (1991)

State and local tests' purpose and content are…    Percent of districts substituting state test
…exactly the same or very similar                   82
…somewhat or moderately similar                     69
…not at all similar or very little                  41

SOURCE: U.S. GAO, 1993.

23 Topic 5: The Cost of Standardized Testing (from society’s point of view)

24 5) Cost jargon
Marginal cost (the cost of the next unit):
- For a test, it is the cost that is incurred due to the addition of a test, and only that cost. (E.g., during test administration, the school building must be maintained, but such would be the case without a test, too. The test is not responsible for this cost.)
- Subject-matter instruction occurs whether or not there is external testing, so it also is not a cost of the test.
Opportunity cost (the cost of foregone opportunities, i.e., instead of doing this, you could have been at work making money):
- For a test, the time a teacher spends preparing for, monitoring, or scoring a test is time he could have been planning his course, grading homework, etc.
- If the teacher makes productive use of the time while students are taking a test, there are no opportunity costs.
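
The marginal-cost rule above amounts to a filter: count a cost only if it would not exist without the test. A minimal sketch follows, with hypothetical cost items and dollar amounts chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    name: str
    amount: float                # dollars per student (hypothetical)
    incurred_without_test: bool  # True if the cost exists with or without the test


def marginal_cost(items):
    """Sum only the costs that exist because the test was added."""
    return sum(item.amount for item in items if not item.incurred_without_test)


items = [
    CostItem("building maintenance", 4.0, True),
    CostItem("subject-matter instruction", 20.0, True),
    CostItem("test booklets and scoring", 3.0, False),
    # Opportunity cost: teacher time diverted from other productive work.
    CostItem("teacher time spent monitoring", 2.0, False),
]
print(marginal_cost(items))  # 5.0
```

An all-inclusive tally of the same items would give 29.0, which is how counting costs the test is not responsible for can inflate an estimate.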

25 5) Average all-inclusive per-student costs of two test types in states having both: 1990-91
                          Type of test
Cost factors              Multiple-choice   Performance
Start-up development      $2                $10
Ongoing, annual costs     $16               $33

SOURCE: U.S. GAO, 1993, p.43

26 5) Average per-student costs of two test types in states having both, with adjustments
Columns, in order: All systemwide tests; Sample of 11 state performance tests; Sample of 6 multiple-choice tests in those same states
- All-inclusive marginal cost: $15, $33, $16
- …minus adjustment for regular school year administration: -7, -15
- …minus adjustment for replacement of preexisting tests: -6, -12
- Marginal cost after adjustments: $5, $11, $2
SOURCE: Phelps, 2000.

27 5) “Economies” jargon
The unit cost of producing your product declines the more of an “economy” you have (because fixed/overhead costs get spread out)
- Scale: you can sell at lower cost because you make so many of them
- Scope: you can sell at lower cost because you make other stuff that is similar, or in similar ways
- Learning: you figure out ways to be more efficient and productive as you gain experience
There are many “economies” (just like validities)
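
The scale case can be written as a one-line formula: unit cost equals fixed cost spread over all units, plus the variable cost of each unit. The sketch below uses hypothetical figures (a $500,000 development cost and $3 per test administered) that are not taken from the presentation.

```python
def unit_cost(fixed_cost, variable_cost_per_unit, units):
    """Average cost per unit when fixed costs are spread across all units."""
    return fixed_cost / units + variable_cost_per_unit


# Hypothetical test program: $500,000 to develop, $3 per student to administer.
for students in (1_000, 10_000, 100_000):
    cost = unit_cost(500_000, 3.0, students)
    print(f"{students:>7,} students: ${cost:,.2f} per student")
```

As volume grows, the per-student cost falls toward the $3 variable cost, because the fixed development cost is shared among more test takers.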

28 Economies of scale in state performance testing

29 Some economies of scope in state performance testing

30 5) General structure of testing costs
Scorers are...
- GROUPS of teachers or professional scorers
- INDIVIDUAL teachers or professional scorers
- a COMPUTER
Students take tests...
- EN MASSE
- in GROUPS
- ONE at a TIME

31 5) Slack capacity in U.S. students’ time = opportunity for windfall gain?
Average number of hours per day devoted to…
Columns, in order: Sports; TV watching; Playing or socializing; Studying
- USA: 2.2, 2.6, 2.5, 2.3
- East Asia (N = 5): 0.9, 2.4, 1.3, 3.1
- West Europe (N = 4): 1.6, 2.0, 2.8
- East Europe (N = 7): 2.9

32 Topic 6: The Benefits of Standardized Testing -- Information

33 6) Information benefits of testing
For whom? Could be anyone: student, parent, teacher, school, public, postsecondary institution, employer, …
Information can be used beneficially in:
- Diagnosis (of student, teacher, school, …)
- Alignment (to standards, schedule, each other, …)
- Learning for teachers
- Goodwill with public
- Decisions (promotion, placement, selection, …)

34 6) Information benefits of testing – how are they measured?
- Predictive validity (fairly measurable)
- Allocative efficiency (fairly measurable) (the greater the range restriction the higher the allocative efficiency?)
- Alignment (not so easy to measure)
- Goodwill (not at all easy to measure)

35 Topic 7: The Benefits of Standardized Testing -- Motivation

36 7) Motivational benefits of testing – how are they measured?
In controlled experiments:
- Ex. A) One group is told the test at the end of the course comes with a reward; control group is told it does not count
- Ex. B) One group is tested throughout the course; control group is not
In large-scale studies, graduates from regions with high-stakes tests are compared to their non-tested counterparts:
- By their relative performance on another, common test
- By their relative wages after graduation
- By their relative rates of dropout, persistence, attainment, …
“Backwash Effect” (e.g., students in states with high-stakes high school graduation tests perform better even on the 8th-grade level IAEP, TIMSS, or NAEP)

37 7) Large-scale studies finding benefits to the use of external, high-stakes examinations
John Bishop (1980s+) several studies -- IAEP, TIMSS, SAT, NY State, Canada, … Winfield; Fredericksen; Bishop; Jacobson (minimum comp. states) Others: Graham, Husted (SAT); Grissmer, Flanagan (NAEP); Phelps (TIMSS+); Carnoy (NAEP); Rosenshine (NAEP); Braun (NAEP); Wenglinsky

38 7) Smaller-scale studies finding benefits to the use of high-stakes examinations
Controlled experiments: Tuckman, Trimble; Webb; Wolf, Smith; Egeland; Jones; Brown, Walberg; Tuckman; Khalaf, Hanna; others…
Evaluations: Anderson, Muir, Bateson, Blackmore, Rogers; Heyneman; G.A.O.; Achieve; Stake, Theobald; Bond, Cohen; Calder; Glassnap, Pogio, Miller; others…
Case studies: S.R.E.B.; Schleisman; Neville; Goldberg, Roswell; Schlawin; Delong; Lerner; Jett, Shafer; others…

39 7) Bishop's estimates of dollar value of high-stakes exams on student outcomes
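Canada: High-stakes testing provinces vs. others
- Difference (in standard deviation units): .233 (in math); .183 (in science)
- Difference (in grade-level-equivalent units): .75 (in math); .67 (in science)
- Difference per student (in net present value) in 1993 dollars*: $13,370 (in math); $11,940 (in science)
USA: New York State vs rest of U.S.
- Difference (in standard deviation units): .164 (in SAT Verbal + Math)
- Difference (in grade-level-equivalent units): .75 (verbal + math)
- Difference per student (in net present value) in 1993 dollars*: $13,370
IAEP: High-stakes testing countries vs. others
- Difference (in standard deviation units): .586 (in math)
- Difference (in grade-level-equivalent units): 2.0 (in math); .7 (in science)
- Difference per student (in net present value) in 1993 dollars*: $35,650 (in math); $12,480 (in science)
TIMSS: High-stakes testing countries vs. others
- Difference (in standard deviation units): n/a
- Difference (in grade-level-equivalent units): .9 (in math); 1.3 (in science)
- Difference per student (in net present value) in 1993 dollars*: $16,040 (in math); $23,170 (in science)
* Based on male-female average, averaged across six longitudinal studies, cited in Bishop, 1995a, Table 2, counting only general academic achievement, not accounting for technical abilities.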
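
The dollar figures in this table appear to be a constant multiple of the grade-level-equivalent differences. The sketch below checks that reading. The per-grade-level value is inferred here from one row ($13,370 for a .75 difference); it is an assumption of this sketch, not a figure stated in the slide, and rounding to the nearest $10 is likewise assumed.

```python
# Dollar value of one grade-level equivalent, inferred from the row
# "$13,370 for .75" (an assumption of this sketch, not stated in the slide).
VALUE_PER_GLE = 13_370 / 0.75

# label: (grade-level-equivalent difference, dollar value shown in the table)
rows = {
    "Canada, science": (0.67, 11_940),
    "IAEP, math": (2.0, 35_650),
    "IAEP, science": (0.7, 12_480),
    "TIMSS, math": (0.9, 16_040),
    "TIMSS, science": (1.3, 23_170),
}


def implied_dollars(gle):
    """Scale a grade-level-equivalent difference to dollars, rounded to $10."""
    return round(gle * VALUE_PER_GLE, -1)


for label, (gle, reported) in rows.items():
    print(f"{label}: implied ${implied_dollars(gle):,.0f}, table ${reported:,}")
```

Under these assumptions every implied value matches the table, which suggests the net-present-value column was derived by scaling grade-level equivalents at roughly $17,800 each in 1993 dollars.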

40 Topic 8: Optimal testing system structures

41 8) Single or multiple target systems
Becker and Rosen (1990)
- A “single target” examination (e.g., minimum competency) is problematic
  - Set too high, slower kids will be discouraged and drop out
  - Set too low, advanced kids will be bored and may work less
- Examination systems should have multiple targets
Empirical Studies of 1970s-1980s Minimum Competency Exams (e.g., Ligon, Mangino, Babcock Johnstone, Brightman, Davis)
- Performance of lowest students did improve, but that of advanced students either stayed flat or decreased
Jonathan Jacobson (1992)
- Longitudinal analysis of students from minimum competency states showed that slowest students gained and middle students lost
- Probably, the test induced resource flows to the slow students and away from the middle students

42 8) Examples of multiple target systems
Hierarchical, or “tiered,” systems: British system, New York State
- All students must pass exams with broad, common requirements, but at choice of levels (Advanced or Ordinary; Competency or Honors)
- British just recently changed, creating a hybrid that looks more like continental exam systems
Branched or parallel track systems: most of Continental Europe
- Students choose (or the choice is made for them) where to concentrate their efforts, and they are tested mostly on that concentration
- First branching (junior high level) into academic, general, vocational
- Second branching (high school level) into subject area or vocational concentration

43 8) Some current research on testing system structure
John Bishop
- Suspects that standardized end-of-course or end-of-year examinations may be the optimal form of standardized testing.
- Why? Perhaps because they combine the best of both worlds:
  - standardized and external
  - concise, targeted, with very strong alignment between curriculum and test
Value-added systems
- Concerns for volatility and fairness mandate that the testing be frequent, at least annual
Tests are not the only quality control measure; the question is how to optimize the whole set (Phelps, Just for the Kids, others…)

44 8) The more high-stakes decision points, the better the student performance ?
SOURCE: Phelps, 2001

45 8) Quality control has proportionally greater effect in poorer countries
SOURCE: Phelps, 2001

46 Topic 9: Optimal testing industry structures

47 9) The industry structure game, in theory
Selfish consumers want a perfectly competitive industry
- Lots of producers, cutthroat competition
- Easy producer entry to, exit from industry
- Low prices, lots of choice and information
Selfish producers want to be monopolists
- Raise prices, lower quality
- Block new entrants, withhold information

48 9) The industry structure game, in practice
Consumers want stable suppliers, salespeople they know, brand names they can trust
- So, sure, they want competition, choice, and low prices…
- But, they do not want to have to try out a new brand of detergent after every visit to the grocery store
Producers try to avoid monopoly, or else get regulated or split up
- E.g., Microsoft pushes Apple and Corel to the brink of bankruptcy, then tosses each of them a lifeline to keep them in business (barely)
- So, the goal is to approach having a monopoly without quite having one

49 9) Competitive strategy theory
- In industries with steep economies (of scale, scope, learning, …) there is only room for so many producers
- If you do not have the relevant “economies” in your firm, you had better focus on a specialty niche that makes you unique, or else get out
E.g., General Electric/RCA Consumer Electronics (1987)
- Crowded field: Sony, Zenith, Philips, Toshiba, Mitsubishi, others
- Sony: technological edge, reputation for quality, could charge high prices
- Niche players: Mitsubishi (big screen TVs); Sharp (flat panels)
- Low cost players: Koreans had entered the market; Chinese were purchasing the facilities of bankrupt American firms (e.g., Admiral, Philco, Sylvania)
- Japanese manufacturers were building assembly plants in US and Mexico in order to lower their shipping costs for large sets
- GE was “stuck in the middle”: it could not compete on cost or quality and had no unique niche, so it sold out

50 9) Possible sources of competitive advantage in the testing industry
Advantages related to scale economies
- Huge item banks take time to accumulate and test and they are copyrighted (‘sunk costs’ => barrier to entry)
- Established client base, relationships
Advantages related to scope economies
- Much psychometric expertise is equally useful across a variety of tests
- Customers’ needs largely similar across states, countries
- Good brand name provides instant cachet in new markets
Advantages related to learning economies
- Experience working with, knowledge of clients
- Experience gained with a new type of product will lower cost for subsequent, similar projects

51 9) Niche markets in educational testing (where “economies” may be of little help)
- Custom-made performance tests, “built from scratch”
- Some special education and psychological testing that requires one-on-one administration, highly-specialized protocols, or licensed test administrators
- Some vocational-occupational testing that employs “hands on” demonstrations observed by specialists
- Oral interviews

52 Topic 10: Discussion

