Doing Synthesis and Meta-Analysis in Applied Linguistics Lourdes Ortega University of Hawai‘i at Mānoa National Tsing Hua University Taiwan, June 8,


1 Doing Synthesis and Meta-Analysis in Applied Linguistics. Lourdes Ortega, University of Hawai‘i at Mānoa. National Tsing Hua University, Taiwan, June 8, 2011

2 Please cite as: Ortega, L. (2011). Doing synthesis and meta-analysis in applied linguistics. Invited workshop at Tsing Hua University, Taipei, June 8, 2011. Copyright © Lourdes Ortega, 2011

3 Research synthesis (including meta-analysis) 1. What is it? 2. Why do it? 3. How do we do it? 4. An example… 5. Challenges? 6. Value?

4 What is research synthesis?

5 The reviewing continuum (secondary research): Narrative … Systematic. Labels placed along the continuum: LIT REVIEW, SYNTHESIS, META-ANALYSIS

6 So, what is meta-analysis, specifically? …one specific kind of research synthesis… Secondary analysis of quantitative analyses Each primary study is a data point Goal: what are the main ‘effects’ or ‘relationships’ found across many studies? Strictly speaking, only quantitative studies apply

7 Why do it?

8 Traditional literature reviews… …have led to unending debates: What does the evidence “say”? According to whom? How do we know who is right?

9 e.g.: error correction (Ferris vs. Truscott) e.g.: Critical Period Hypothesis (Hyltenstam et al. vs. Birdsong)

10 Typical strategies of traditional reviews?

11 Tables summarizing many studies

12 e.g. from Krashen et al. (1979):

13 Vote-counting technique

14 e.g.: Error correction in L2 writing

15 Limitations: No specific set of methods, up to mysterious expertise. Experts are always vested, therefore vulnerable to charges of bias. Statistical significance has serious pitfalls. Idiosyncratic methodology. Evidentiary warrants difficult to judge. Over-reliance on statistical significance (but magnitude, not just generalizability, is of interest to social scientists!)
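
The pitfall of relying on statistical significance, and of counting significant results as "votes", can be sketched with two hypothetical studies that find the same effect size but differ in sample size. The numbers below are invented for illustration, and the p-value uses a large-sample normal approximation rather than an exact t-test, so treat it as a rough sketch only.

```python
from math import sqrt
from statistics import NormalDist


def approx_two_sided_p(d: float, n_per_group: int) -> float:
    """Approximate two-sided p-value for a two-group comparison.

    Assumes equal group sizes and uses the normal approximation
    z = d * sqrt(n / 2), which is only reasonable for larger samples.
    """
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical studies: identical effect size, different sample sizes.
studies = {"small study": (0.5, 10), "large study": (0.5, 100)}

votes = {}
for name, (d, n) in studies.items():
    p = approx_two_sided_p(d, n)
    # The "vote" a vote-counting review would record for this study.
    votes[name] = p < 0.05
    print(f"{name}: d = {d:.2f}, n per group = {n}, "
          f"p ~ {p:.4f}, significant: {votes[name]}")
```

A vote count would record one "yes" and one "no" here, even though both studies observed the same magnitude of effect.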

16 What does the evidence “say”? According to whom? How do we know who is right?

17 SOLUTION in the late 1970s: Methods for reviewing, from “art” into “science”: Systematic, not arbitrary. Replicable. More than the sum of the parts. Secondary, yes... but empirically accountable, & discovering new truths in old data

18 How do we do it?

19 Norris & Ortega (2006a, 2006b)

20 Norris, J. M., & Ortega, L. (2010). Timeline: Research synthesis. Language Teaching, 43, 461-479. Ortega, L. (2010). Research synthesis. In B. Paltridge & A. Phakiti (Eds.), Companion to research methods in applied linguistics (pp. 111-126). London: Continuum. Norris, J. M. (2012). Meta-analysis. In C. Chapelle (Ed.), Encyclopedia of applied linguistics. Malden, MA: Wiley. Norris, J. M., & Ortega, L. (2007). The future of research synthesis in applied linguistics: Beyond art or science. TESOL Quarterly, 41, 805-815.

21 What are the definitional features of all syntheses (including meta-analyses)? 1. Principled selection of primary studies 2. Systematic coding of each study for main variables 3. Direct use of the evidence reported (not the authors’ interpretations) across studies

22 1. Principled selection of studies. Sampling is central to empirical research → what population are we trying to understand? Random [experimental]. Purposive [qualitative].

23 Sampling is central to synthesis, as well Complete [secondary research should be based on the full universe of studies that have investigated the same thing]

24 Search & Retrieval of Literature. The literature search is a key step in systematic synthesis (some direction: In'nami & Koizumi, 2010) → identify all studies that are relevant. Exhaustive [electronic, hand, footnote chasing, invisible college]. Replicable [fully explained in report].

25 1st → electronic searches. 2nd → other techniques: Manual searches of journals; Footnote chasing; Forward searches with Web of Science; Website searches of key contributing scholars; Polite email requests to authors & experts

26 Inclusion & Exclusion criteria All potentially relevant studies must then be examined to decide: Include or Exclude (“apples or oranges?”) Inclusion criteria [all criteria satisfied] Exclusion criteria [explain each reason for exclusion and give examples] Full rationale: [tables, appendices, philosophy of inclusivity or selectivity]
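
One way to keep inclusion and exclusion decisions replicable is to record every criterion explicitly and log the reason for each exclusion. The sketch below assumes three made-up criteria and a hypothetical `Candidate` record; none of these names come from the slides, and a real synthesis would define its own criteria.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A potentially relevant study (all fields are hypothetical)."""
    study_id: str
    is_empirical: bool
    has_comparison_group: bool
    reports_descriptives: bool


# Each criterion pairs an exclusion label with a check the study must pass.
CRITERIA = [
    ("not an empirical study", lambda s: s.is_empirical),
    ("no comparison group", lambda s: s.has_comparison_group),
    ("no means/SDs reported", lambda s: s.reports_descriptives),
]


def screen(candidates):
    """Include a study only if all criteria are satisfied; log every failure."""
    included, excluded = [], {}
    for study in candidates:
        reasons = [label for label, check in CRITERIA if not check(study)]
        if reasons:
            excluded[study.study_id] = reasons
        else:
            included.append(study.study_id)
    return included, excluded


pool = [
    Candidate("A", True, True, True),
    Candidate("B", True, False, True),
    Candidate("C", False, False, False),
]
included, excluded = screen(pool)
print("Included:", included)
for study_id, reasons in excluded.items():
    print(f"Excluded {study_id}: {'; '.join(reasons)}")
```

The `excluded` log is the kind of material that could feed the tables and appendices giving the full rationale.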

27 What are the definitional features of all syntheses (including meta-analyses)? 1. Principled selection of studies: Literature search + Study eligibility criteria, Inclusion/exclusion

28 2. Systematic coding of each study. Eliciting evidence with consistency, just as when surveying, interviewing, or testing participants → Asking research questions of the literature: What variables are important? How (and how well) have they been investigated? What are the findings across studies?

29 Coding book to identify study features that answer questions. Multiple coders. Publication features: Year, Author, Published or Fugitive?, Journal, Book, Dissertation, Presentation. Substantive features: e.g., How was “explicit” instruction defined? e.g., How was “learning” measured? Methodological features: e.g., Means, sd, etc.? Sample size, Design, Reliability, Stats used, etc.

30 What are the definitional features of all syntheses (including all meta-analyses)? 1. Principled selection of studies 2. Systematic coding of each study for main variables: Coding book, Standardization, Intercoder reliability
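
Intercoder reliability can be quantified in several ways. The slides do not name an index, so the sketch below assumes two common choices, simple percent agreement and Cohen's kappa (agreement corrected for chance), applied to invented codes from two hypothetical coders.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Proportion of studies that both coders coded identically."""
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set.")
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement from each coder's marginal category proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for "type of instruction" in six studies.
coder_a = ["explicit", "explicit", "implicit", "implicit", "explicit", "implicit"]
coder_b = ["explicit", "implicit", "implicit", "implicit", "explicit", "implicit"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"Percent agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```

Kappa comes out lower than raw agreement here because some of the matches would be expected by chance alone.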

31 3. Trust the evidence, not the authors. Record carefully what authors report and how they report it,… But ultimately, analyze what the evidence they present tells us, not what they say it means… Seeking an objective view across studies of the accumulated state of knowledge…

32 When aggregating and averaging findings is the goal, as in meta-analysis… How do we compare, combine, and interpret findings across numerous quantitative studies of the same thing? → effect sizes & confidence intervals

33 Effect size: What is it? An estimate of the magnitude or strength of a quantitative finding: …how much difference? …how much improvement? …how closely related?

34 Effect sizes: absolute scales
Scale | Study 1 | Study 2
1. percent | Experimental group = 30% better than control | Experimental group = 20% better than control
2. correlation | Motivation & achievement, r = .36 | Motivation & achievement, r = .78
3. known measure | Pre-post TOEFL score: 450 → 575 | Pre-post TOEFL score: 450 → 495
Q: What happens when studies do not report findings on comparable scales?

35 Effect sizes: standardized. Effect size d = the average of the experimental group minus the average of the control group, divided by the pooled standard deviation of both groups. d is also simple to calculate and to interpret, and it incorporates variability differences between groups.
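
The verbal definition of d can be sketched in a few lines. The scores are invented, and the pooling formula (sample variances weighted by n - 1) is a common convention assumed here; the slide does not say which pooling formula was used.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(experimental, control):
    """Standardized mean difference: (M_exp - M_ctrl) / pooled SD."""
    n_e, n_c = len(experimental), len(control)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n_e - 1) * variance(experimental) + (n_c - 1) * variance(control)
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_var)


# Hypothetical post-test scores.
experimental = [14, 16, 18, 20, 22]
control = [10, 12, 14, 16, 18]

d = cohens_d(experimental, control)
print(f"d = {d:.2f}")
```

Because d is expressed in standard deviation units, studies that used different tests or scoring scales can be placed on one common metric.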

36 Effect sizes: standardized. Difference between experimental and control groups in standard deviation units (Cohen’s d). [Figure: two pairs of experimental and control distributions, one showing no sizeable effect (d = 0.10) and one showing a very large effect (d = 3.00)]

37 Effect sizes for meta-analysis: Study 1 → effect size 1; Study 2 → effect size 2; Study 3 → effect size 3; Study 4 → effect size 4; Study 5 → effect size 5; Study … → … = average effect size
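
Averaging across studies can be sketched as follows. The (d, N) pairs are invented, and the sample-size-weighted variant is shown only as one possible refinement; the slide itself says nothing more specific than "average effect size".

```python
from statistics import mean

# Hypothetical (effect size d, total sample size N) for four primary studies.
studies = [(0.40, 20), (0.90, 60), (1.20, 40), (0.60, 80)]


def unweighted_mean_d(studies):
    """Every study counts equally as one data point."""
    return mean(d for d, _ in studies)


def n_weighted_mean_d(studies):
    """Larger studies contribute proportionally more to the average."""
    total_n = sum(n for _, n in studies)
    return sum(d * n for d, n in studies) / total_n


simple_avg = unweighted_mean_d(studies)
weighted_avg = n_weighted_mean_d(studies)
print(f"Unweighted mean d = {simple_avg:.3f}")
print(f"N-weighted mean d = {weighted_avg:.3f}")
```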

38 Interpreting effect sizes: What does d really tell us? "The terms 'small,' 'medium,' and 'large' are relative, not only to each other, but to the area of behavioral science or even more particularly to the specific content and research method being employed in any given investigation..." (Cohen, 1988, p. 25) Bands shown on the slide: d < .30; d > .30 and d < .80; d > .80

39 The average is not enough → Confidence Intervals. The stroll from the hotel to the University is, on average, 10 minutes, plus or minus 3 minutes: Upper bound = 13 minutes; Average = 10 minutes; Lower bound = 7 minutes. “The margin of error in an observation”; 95% certainty

40 Confidence Intervals in Meta-analysis CIs tell us about the certainty with which we can interpret an average effect size.

41 Effect Sizes and Confidence Intervals in Meta-analysis
 | N | K | Mean d | SD d | 95% CI lower | 95% CI upper
Avg. effect of instructional treatment | 49 | 98 | .96 | .87 | .78 | 1.14
We can be 95% certain that the actual effect of instruction lies between .78 and 1.14
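
A confidence interval of this kind can be approximated from the mean d, its standard deviation, and the number of contrasts. The sketch assumes the figures on the slide read as K = 98 contrasts, mean d = .96, and SD = .87, and it uses a simple normal-approximation interval. That choice is an assumption, not necessarily the procedure behind the slide, so the result comes out close to the reported bounds (roughly .79 to 1.13) without matching them exactly.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean_d: float, sd_d: float, k: int, level: float = 0.95):
    """Normal-approximation confidence interval around a mean effect size."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sd_d / sqrt(k)
    return mean_d - half_width, mean_d + half_width


lower, upper = mean_ci(mean_d=0.96, sd_d=0.87, k=98)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

Because the interval stays well above zero, the average effect can be interpreted with some confidence. A wider interval, or one that includes zero, would call for more caution.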

42 Why does it help to focus on effect sizes? e.g., effects of smoking research in the 1960s. There is a statistically significant difference in mortality rates between smokers and non-smokers. Compare: Smoking up to half a pack a day (fewer than 10 cigarettes) increases the chance of mortality by 40% when compared to non-smokers. Smoking two packs or more a day increases the risk of death by three times, to 120%, when compared to non-smokers (U.S. Department of Health, Education, and Welfare Report, 1967).

43 And what about small effects: can they be important too? r = .034, a truly ‘tiny’ effect! Regular aspirin consumption and decrease in heart attacks = 3.4% decrease = at least 3 out of 100 who would not have a heart attack if they regularly took aspirin. d = .30, a small magnitude effect! Effects of reading tutorials for underachieving students, the same for untrained peer tutoring and for highly trained teachers engaging in longer hours of tutoring. Both are important! Interpreting effect sizes: complex, contextualized, not absolute
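
One common way to read a correlation as a difference in outcome rates is the binomial effect size display (BESD), which sets the two groups' rates at .50 plus and minus r/2. The slide does not name this technique, so applying it to the aspirin example is an assumption made here for illustration.

```python
def besd(r: float):
    """Binomial effect size display: rates of .50 + r/2 and .50 - r/2."""
    return 0.5 + r / 2, 0.5 - r / 2


better_rate, worse_rate = besd(0.034)
difference_per_100 = (better_rate - worse_rate) * 100

print(f"Favourable outcome rate with treatment: {better_rate:.3f}")
print(f"Favourable outcome rate without:        {worse_rate:.3f}")
print(f"Difference per 100 people: {difference_per_100:.1f}")
```

Under this display, r = .034 corresponds to a difference of about 3.4 people per 100, in line with the "at least 3 out of 100" reading on the slide.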

44 What are the definitional features of all syntheses (including all meta-analyses)? 1. Principled selection of studies 2. Systematic coding of each study for main variables 3. Direct use of the evidence reported (not the authors’ interpretations): Effect sizes, Confidence Intervals, Other kinds of new data based on old

45 How do we do it? An example of Synthesis+meta-analysis

46 In applied linguistics, the first full-blown synthesis and meta-analysis: Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50, 417-528.

47 Step 1: Problem Specification. Effects of instruction: Recasts; Garden path; Input enhancement; Input processing; Input flood; inductive; Task-based interaction; Traditional grammar; Consciousness-raising; dictogloss

48 Focus of Norris & Ortega: L2 instruction → L2 learning. RQ 1 & 2: Instruction overall? By type? RQ 3: Effect of outcome measures? RQ 4: Instructional intensity? RQ 5: Durability of effects? RQ 6: Quality of research practices?

49 Step 2: Literature search. 1st → electronic searches. 2nd → other techniques: Manual searches of 14 journals; Footnote chasing of 25 reviews; Footnote chasing of each study included

50 Step 3: Study eligibility criteria. Potentially relevant: 250 → relevant for synthesis: 77 → adequate for meta-analysis: 49

51 Step 4: Coding of study features Type of instruction: FonF, FonFS, explicit, implicit Type of outcome measure: metalinguistic, selected, constrained, free Intensity of instruction: Brief (less than 1 hr), short (between 1 and 2 hrs), medium (between 3 and 6 hrs), long (more than 7 hrs) Durability of effects: effect sizes on delayed tests

52 Steps 5 & 6: Analyze, display, interpret

53 Findings RQ 1 & 2 (effectiveness):

54 Findings RQ 3 (type of measure)

55 Findings RQ 4 (intensity):

56 Findings RQ 5 (durability):

57 RQ 1-5 (meta-analysis part): How effective is L2 instruction? Clearly more effective than no instruction or only meaningful exposure to L2 → d = 0.96 based on 49 studies. Explicit instruction is superior in the short term to implicit instruction → d = 1.13 versus d = 0.54, based on 69 and 29 contrasts, respectively. But focus on form and on formS are equally effective → d = 1.00 form versus 0.93 formS, based on 43 and 55 contrasts, respectively. Effects are durable → delayed post-tests from 22 studies: d = 1.02

58 RQ 6 (synthesis part): Research practices. Too many variables in a single design → need to simplify designs, increase N. No pre-test (18%), no true control group (83%) → need to always include both. Poor reporting standards (52% no sd, 84% no instrument reliability, 57% no set alpha) → editors need to demand better reporting. Misuse of statistical inference (no assumptions checked or met, parametric stats on small samples, no consideration of magnitude) → the field needs better training in statistics if it insists on using such methods

59 Since then…accumulation of meta-analyses In 2000, when Norris & Ortega was published, there were only 2 other published systematic syntheses in applied linguistics. As of 2010, Norris & Ortega identified 23 in their Timeline, most published since 2006. Motivation: Masgoret & Gardner (2003) Interaction: Keck et al. (2006), Mackey & Goo (2007) Oral feedback: Russell & Spada (2006), Lyster & Saito (2010), Li (2010) Use of glosses in CALL: Taylor (2006 & 2009), Abraham (2008)

60 Some challenges for research synthesis in L2 research…

61 Publication bias: “file drawer problem”. Well known phenomenon, present in all the social sciences (Rosenthal, 1979; Rothstein et al., 2005). Little understood in applied linguistics. Include fugitive literature. Check for publication bias.
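
One classic check associated with the file drawer problem is Rosenthal's fail-safe N. It estimates how many unretrieved studies averaging a null result would be needed to bring the combined result down to non-significance. The slide does not say which check was intended, so this is one option among several, and the z-scores below are invented.

```python
from statistics import NormalDist


def fail_safe_n(z_scores, alpha: float = 0.05) -> float:
    """Rosenthal's fail-safe N: (sum of Z)^2 / z_crit^2 - k.

    z_crit is the one-tailed critical value (about 1.645 for alpha = .05),
    and k is the number of studies actually retrieved.
    """
    k = len(z_scores)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return (sum(z_scores) ** 2) / (z_crit ** 2) - k


# Hypothetical standard normal deviates (Z) from five retrieved studies.
z_scores = [2.1, 1.8, 2.5, 0.9, 1.7]

n_fs = fail_safe_n(z_scores)
print(f"Fail-safe N ~ {n_fs:.1f} unretrieved null studies")
```

A fail-safe N that is small relative to the number of retrieved studies suggests the overall finding could be fragile if unpublished null results exist.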

62 Quality: “garbage in, garbage out”. The quality of a synthesis can only be as good as the quality of the primary studies that are synthesized in it... But how do we judge quality? Publication type? Methodology ratings? Exclusions?

63 Ethics: Anticipate consequences of synthesis. Would it prematurely close the area for research? Would it be taken as a personal attack on researchers/labs? What is the potential for findings to be (mis)appropriated by audiences (policy makers, teachers, …)?

64 High-tech statistication, cookie-cutter approach “... conceptual vacuum when technical meta-analytic expertise is not coupled with deep knowledge of the theoretical and conceptual issues at stake in the research domain under review…” (Norris & Ortega, 2006b, p. 37)

65 Meta-analysis only, no interest in quantitative synthesis of other kinds/scope. New-generation meta-analyses bypass synthesis: Li (2010), Lyster & Saito (2010), Plonsky (2011), Spada & Tomita (2010). Thomas (1994, 2006), Ortega (2003), ?????

66 Qualitative synthesis? Yet, much contemporary research in applied linguistics is qualitative, and increasingly more is mixed-methods… both worth synthesizing! No interest either in exploring qualitative synthesis… Only Téllez & Waxman (2006) in applied linguistics

67 Meta-ethnography (Noblit & Hare, 1988; see Téllez & Waxman, 2006) Qualitative Comparative Analysis (Ragin, 1999) Critical Interpretive Synthesis (Dixon-Woods et al., 2006) And there are options to draw from in education, health sciences, and other fields!

68 Value?

69 There is huge value in systematic synthesis (including meta-analysis): Secondary research, yes... but: Empirically accountable Conceptually illuminating: discovering new truths in old data

70 Sustained progress… Much improvement in certain reporting practices (LL, MLJ in particular). Larger N in primary studies = more trustworthy analyses. Use of increasingly sophisticated techniques in meta-analyses… → study quality criteria, weighting (by N, reliability, variance), fixed/random effects models, sensitivity analysis, trim-and-fill estimations, publication bias, etc. Use of meta-analytic software, e.g.: http://www.meta-analysis.com
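
Weighting by variance under a fixed-effect model can be sketched as below. The studies are invented, and the variance formula for d is a common large-sample approximation assumed here; the slides list these techniques without giving formulas. A random-effects model would add a between-study variance component, which is omitted.

```python
from math import sqrt


def d_variance(d: float, n_e: int, n_c: int) -> float:
    """Approximate sampling variance of a standardized mean difference."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))


def fixed_effect(studies):
    """Inverse-variance weighted mean effect size and its standard error."""
    weights = [1 / d_variance(d, n_e, n_c) for d, n_e, n_c in studies]
    pooled = sum(w * d for w, (d, _, _) in zip(weights, studies)) / sum(weights)
    standard_error = sqrt(1 / sum(weights))
    return pooled, standard_error


# Hypothetical (d, n experimental, n control) for a small and a large study.
studies = [(0.30, 20, 20), (0.90, 80, 80)]

pooled_d, se = fixed_effect(studies)
print(f"Fixed-effect pooled d = {pooled_d:.2f} (SE = {se:.2f})")
```

The pooled estimate lands closer to the larger study's effect because more precise studies receive more weight.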

71 “we envision synthetic methodologies as advancing our ability to produce new knowledge by carefully building upon, expanding, and transforming what has been accumulated over time... However,... all knowledge is bound by context and purpose...” (Norris & Ortega, 2006b, p. 37) But only if applied linguists cultivate “the will to synthesis”

72 Thank You lortega@hawaii.edu

73 References Abraham, L. B. (2008). Computer-mediated glosses in second language reading comprehension and vocabulary learning: A meta-analysis. Computer Assisted Language Learning, 21, 199-226. Dixon-Woods, M., Bonas, S., Booth, A., Jones, D. R., Miller, T., Sutton, A. J., et al. (2006). How can systematic reviews incorporate qualitative research? A critical perspective. Qualitative Research, 6, 27-44. Keck, C. M., Iberri-Shea, G., Tracy-Ventura, N., & Wa-Mbaleka, S. (2006). Investigating the empirical link between task-based interaction and acquisition: A meta-analysis. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 91-131). Amsterdam: John Benjamins. Krashen, S., Long, M. H., & Scarcella, R. (1979). Accounting for child-adult differences in second language rate and attainment. TESOL Quarterly, 13, 573-582. Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60, 309-365.

74 Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA: A meta-analysis. Studies in Second Language Acquisition, 32 (2). Mackey, A., & Goo, J. M. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 407-452). New York: Oxford University Press. Masgoret, A.-M., & Gardner, R. C. (2003). Attitudes, motivation, and second language learning: A meta-analysis of studies conducted by Gardner and associates. Language Learning, 53, 123-163. Noblit, G. W., & Hare, R. D. (1988). Meta-ethnography: Synthesizing qualitative studies. Newbury Park, CA: Sage. Norris, J. M. (2012). Meta-analysis. In C. Chapelle (Ed.), Encyclopedia of applied linguistics. Malden, MA: Wiley. Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50, 417-528.

75 Norris, J. M., & Ortega, L. (Eds.). (2006a). Synthesizing research on language learning and teaching. Amsterdam: John Benjamins. Norris, J. M., & Ortega, L. (2006b). The value and practice of research synthesis for language learning and teaching. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 3-50). Amsterdam: John Benjamins. Norris, J. M., & Ortega, L. (2007). The future of research synthesis in applied linguistics: Beyond art or science. TESOL Quarterly, 41, 805-815. Norris, J. M., & Ortega, L. (2010). Research timeline: Research synthesis. Language Teaching, 43, 461-479. Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24, 492-518. Ortega, L. (2010). Research synthesis. In B. Paltridge & A. Phakiti (Eds.), Companion to research methods in applied linguistics (pp. 111-126). London: Continuum.

76 Plonsky, L. (2011). The effectiveness of second language strategy instruction: A meta-analysis. Language Learning, 61 (4). Ragin, C. C. (1999). Using Qualitative Comparative Analysis to study causal complexity. Health Services Research, 34 (5 -Part 2), 1225-1239. Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for the acquisition of L2 grammar: A meta-analysis of the research. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 133-164). Amsterdam: John Benjamins. Spada, N., & Tomita, Y. (2010). Interactions between type of instruction and type of language feature: A meta-analysis. Language Learning, 60, 263-308. Taylor, A. M. (2006). The effects of CALL versus traditional L1 glosses on L2 reading comprehension. CALICO Journal, 23, 309-318. Taylor, A. M. (2009). CALL-based versus paper-based glosses: Is there a difference in reading comprehension? CALICO Journal, 27, 147-160.

77 Téllez, K., & Waxman, H. C. (2006). A meta-synthesis of qualitative research on effective teaching practices for English Language Learners. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 245-277). Amsterdam: John Benjamins. Thomas, M. (1994). Assessment of L2 proficiency in second language acquisition research. Language Learning, 44, 307-336. Thomas, M. (2006). Research synthesis and historiography: The case of assessment of second language proficiency. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 279-298). Amsterdam: John Benjamins.

