

Doing Synthesis and Meta-Analysis in Applied Linguistics. Lourdes Ortega, University of Hawai‘i at Mānoa. National Tsing Hua University, Taiwan, June 8, 2011

Please cite as: Ortega, L. (2011). Doing synthesis and meta-analysis in applied linguistics. Invited workshop at Tsing Hua University, Taipei, June 8, 2011. Copyright © Lourdes Ortega, 2011

Research synthesis (including meta-analysis): 1. What is it? 2. Why do it? 3. How do we do it? 4. An example… 5. Challenges? 6. Value?

What is research synthesis?

The reviewing continuum (secondary research): narrative (LIT REVIEW) … SYNTHESIS … systematic (META-ANALYSIS)

So, what is meta-analysis, specifically? …one specific kind of research synthesis… Secondary analysis of quantitative analyses. Each primary study is a data point. Goal: what are the main ‘effects’ or ‘relationships’ found across many studies? Strictly speaking, only quantitative studies apply.

Why do it?

Traditional literature reviews… …have led to unending debates: What does the evidence “say”? According to whom? How do we know who is right?

e.g.: error correction (Ferris vs. Truscott) e.g.: Critical Period Hypothesis (Hyltenstam et al. vs. Birdsong)

Typical strategies of traditional reviews?

Tables summarizing many studies

e.g. from Krashen et al. (1979):

Vote-counting technique

e.g.: Error correction in L2 writing

Limitations: No specific set of methods, up to mysterious expertise. Experts are always vested, therefore vulnerable to charge of bias. Statistical significance has serious pitfalls. Idiosyncratic methodology. Evidentiary warrants difficult to judge. Over-reliance on statistical significance (but magnitude, not just generalizability, is of interest to social scientists!).

What does the evidence “say”? According to whom? How do we know who is right?

SOLUTION in the late 1970s: Methods for reviewing, from “art” into “science”: Systematic, not arbitrary. Replicable. More than the sum of the parts. Secondary, yes... but empirically accountable, & discovering new truths in old data.

How do we do it?

Norris & Ortega (2006a, 2006b)

Norris, J. M., & Ortega, L. (2010). Timeline: Research synthesis. Language Teaching, 43,
Ortega, L. (2010). Research synthesis. In B. Paltridge & A. Phakiti (Eds.), Companion to research methods in applied linguistics (pp ). London: Continuum.
Norris, J. M. (2012). Meta-analysis. In C. Chapelle (Ed.), Encyclopedia of applied linguistics. Malden, MA: Wiley.
Norris, J. M., & Ortega, L. (2007). The future of research synthesis in applied linguistics: Beyond art or science. TESOL Quarterly, 41,

What are the definitional features of all syntheses (including meta-analyses)? 1. Principled selection of primary studies. 2. Systematic coding of each study for main variables. 3. Direct use of the evidence reported (not the authors’ interpretations) across studies.

1. Principled selection of studies. Sampling is central to empirical research → what population are we trying to understand? Random [experimental]. Purposive [qualitative].

Sampling is central to synthesis, as well: Complete [secondary research should be based on the full universe of studies that have investigated the same thing].

Search & Retrieval of Literature. The literature search is a key step in systematic synthesis (some direction: In'nami & Koizumi, 2010) → identify all studies that are relevant. Exhaustive [electronic, hand, footnote chasing, invisible college]. Replicable [fully explained in report].

1st → electronic searches. 2nd → other techniques: Manual searches of journals; Footnote chasing; Forward searches with Web of Science; Website searches of key contributing scholars; Polite requests to authors & experts.

Inclusion & Exclusion criteria All potentially relevant studies must then be examined to decide: Include or Exclude (“apples or oranges?”) Inclusion criteria [all criteria satisfied] Exclusion criteria [explain each reason for exclusion and give examples] Full rationale: [tables, appendices, philosophy of inclusivity or selectivity]
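The include/exclude decision described above can be made explicit and replicable by writing each criterion down as a check and recording the reason for every exclusion. The following is a minimal sketch only: the `Study` fields and the three criteria are hypothetical placeholders, not the criteria used in any synthesis cited in these slides.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # Hypothetical fields; a real synthesis would define its own criteria.
    study_id: str
    is_quantitative: bool
    has_comparison_group: bool
    reports_means_and_sds: bool


def screen(study):
    """Return (included, reasons). A study is included only if all criteria are met."""
    reasons = []
    if not study.is_quantitative:
        reasons.append("not a quantitative study")
    if not study.has_comparison_group:
        reasons.append("no comparison group")
    if not study.reports_means_and_sds:
        reasons.append("means/SDs not reported")
    return (len(reasons) == 0, reasons)


candidates = [
    Study("A", True, True, True),
    Study("B", True, False, True),
    Study("C", False, False, False),
]

# Keep the list of included studies and, for each excluded study, every reason.
included = [s.study_id for s in candidates if screen(s)[0]]
excluded = {s.study_id: screen(s)[1] for s in candidates if not screen(s)[0]}

print("Included:", included)
for study_id, reasons in excluded.items():
    print("Excluded", study_id, "because:", "; ".join(reasons))
```

Recording the reasons alongside the decision is what lets the report explain each exclusion, as the slide asks.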

What are the definitional features of all syntheses (including meta-analyses)? 1. Principled selection of studies: Literature search + Study eligibility criteria, Inclusion/exclusion.

2. Systematic coding of each study. Eliciting evidence with consistency, just as when surveying, interviewing, or testing participants → Asking research questions of the literature: What variables are important? How (and how well) have they been investigated? What are the findings across studies?

Coding book to identify study features that answer questions; multiple coders.
Publication features: Year; Author; Published or fugitive? (Journal, Book, Dissertation, Presentation)
Substantive features: e.g., How was “explicit” instruction defined? e.g., How was “learning” measured?
Methodological features: e.g., Means, SD, etc.? Sample size; Design; Reliability; Stats used; Etc.

What are the definitional features of all syntheses (including all meta-analyses)? 1. Principled selection of studies. 2. Systematic coding of each study for main variables: Coding book, Standardization, Intercoder reliability.
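Intercoder reliability can be quantified in several ways; the slides do not say which index was used. As one common option, the sketch below computes Cohen's kappa (agreement corrected for chance) for two coders who each assigned one category per study. The category labels and codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders over the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for "type of instruction" in six studies.
coder_a = ["explicit", "explicit", "implicit", "implicit", "explicit", "implicit"]
coder_b = ["explicit", "explicit", "implicit", "explicit", "explicit", "implicit"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"Cohen's kappa = {kappa:.2f}")
```

Simple percent agreement (here 5 of 6) is easier to read but does not correct for agreement that would occur by chance, which is why a chance-corrected index is often reported as well.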

3. Trust the evidence, not the authors. Record carefully what authors report and how they report it,… But ultimately, analyze what the evidence they present tells us, not what they say it means… Seeking an objective view across studies of the accumulated state of knowledge…

When aggregating and averaging findings is the goal, as in meta-analysis… How do we compare, combine, and interpret findings across numerous quantitative studies of the same thing? → effect sizes & confidence intervals

Effect size: What is it? An estimate of the magnitude or strength of a quantitative finding: …how much difference? …how much improvement? …how closely related?

Effect sizes: absolute scales
Scale | Study 1 | Study 2
1. percent | Experimental group = 30% better than control | Experimental group = 20% better than control
2. correlation | Motivation & achievement, r = .36 | Motivation & achievement, r = .78
3. known measure | Pre-post TOEFL score: 450 → 575 | Pre-post TOEFL score: 450 → 495
Q: What happens when studies do not report findings on comparable scales?

Effect sizes: standardized. Effect size d = the average of the experimental group minus the average of the control group, divided by the pooled standard deviation of both groups. d is also simple to calculate and to interpret, and it incorporates variability differences between groups.
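The verbal definition of d above translates directly into a few lines of code. This is a minimal sketch using only the standard library; the scores are invented for illustration, and the pooled standard deviation is computed in the usual way, weighting each group's variance by its degrees of freedom.

```python
import math
from statistics import mean, variance


def cohens_d(experimental, control):
    """Standardized mean difference: (M_exp - M_ctrl) / pooled SD."""
    n_e, n_c = len(experimental), len(control)
    if n_e < 2 or n_c < 2:
        raise ValueError("Each group needs at least two scores")
    # Pool the two sample variances, weighted by degrees of freedom.
    pooled_variance = (
        (n_e - 1) * variance(experimental) + (n_c - 1) * variance(control)
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_variance)


# Hypothetical post-test scores for two small groups.
experimental_scores = [14, 16, 18, 20, 22]
control_scores = [10, 12, 14, 16, 18]

d = cohens_d(experimental_scores, control_scores)
print(f"d = {d:.2f}")
```

When a primary study reports only means, standard deviations, and group sizes, the same formula can be applied to those summary values instead of raw scores.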

Effect sizes: standardized. [Figure: difference between experimental and control groups in standard deviation units (Cohen’s d). Panel 1: no sizeable effect (d = 0.10). Panel 2: very large effect (d = 3.00).]
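Returning to the question of findings reported on different scales: one common answer is to convert them to a single standardized metric before combining them. The sketch below converts a correlation r into d using the widely used formula d = 2r / sqrt(1 - r^2). That formula comes from the general meta-analysis literature, not from these slides, and the two r values are the ones shown in the absolute-scales example.

```python
import math


def r_to_d(r):
    """Convert a correlation coefficient to a standardized mean difference."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2 * r / math.sqrt(1 - r ** 2)


# Correlations between motivation and achievement from the two example studies.
d_study1 = r_to_d(0.36)
d_study2 = r_to_d(0.78)

print(f"Study 1: r = .36 corresponds to d of about {d_study1:.2f}")
print(f"Study 2: r = .78 corresponds to d of about {d_study2:.2f}")
```

Whether such a conversion is appropriate depends on the study designs involved, so it is best treated as an approximation to be reported alongside the original statistics.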

Effect sizes for meta-analysis: Study 1 → effect size 1; Study 2 → effect size 2; Study 3 → effect size 3; Study 4 → effect size 4; Study 5 → effect size 5; Study … → … = average effect size
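The step from one effect size per study to an average effect size can be sketched as follows. The study labels, d values, and sample sizes are hypothetical, and the sample-size weighting shown second is one simple option among several, not necessarily the one used in the meta-analyses discussed here.

```python
from statistics import mean

# Hypothetical (label, d, total sample size) for three primary studies.
studies = [
    ("Study 1", 0.40, 20),
    ("Study 2", 0.80, 40),
    ("Study 3", 1.20, 40),
]


def unweighted_mean_d(studies):
    """Every study counts equally, regardless of its size."""
    return mean(d for _, d, _ in studies)


def n_weighted_mean_d(studies):
    """Larger studies pull the average toward their effect size."""
    total_n = sum(n for _, _, n in studies)
    return sum(d * n for _, d, n in studies) / total_n


print(f"Unweighted mean d: {unweighted_mean_d(studies):.2f}")
print(f"N-weighted mean d: {n_weighted_mean_d(studies):.2f}")
```

The two averages differ whenever study size and effect size are related, which is one reason a meta-analysis should state which kind of average it reports.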

"The terms 'small,' 'medium,' and 'large' are relative, not only to each other, but to the area of behavioral science or even more particularly to the specific content and research method being employed in any given investigation..." (Cohen, 1988, p. 25) Interpreting effect sizes: What does d really tell us? d <.30 d >.30 d <.80 d >.80

The average is not enough → Confidence Intervals. The stroll from the hotel to the University is, on average, 10 minutes, plus or minus 3 minutes: Upper bound = 13 minutes; Average = 10 minutes; Lower bound = 7 minutes. “The margin of error in an observation”, 95% certainty.

Confidence Intervals in Meta-analysis CIs tell us about the certainty with which we can interpret an average effect size.

Effect Sizes and Confidence Intervals in Meta-analysis. [Table: avg. effect of instructional treatment, with columns N, K, Mean d, SD d, 95% CI lower, 95% CI upper.] We can be 95% certain that the actual effect of instruction lies between .78 and 1.14.
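A confidence interval around an average effect size can be sketched from the quantities the table lists: the number of effect sizes (K), their mean, and their standard deviation. The example below uses a normal approximation and hypothetical d values, so it illustrates the logic rather than reproducing the interval reported on the slide.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_d_with_ci(effect_sizes, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    k = len(effect_sizes)
    if k < 2:
        raise ValueError("At least two effect sizes are needed")
    avg = mean(effect_sizes)
    standard_error = stdev(effect_sizes) / math.sqrt(k)
    # Critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return avg, avg - margin, avg + margin


# Hypothetical effect sizes from five studies.
effect_sizes = [0.5, 0.7, 0.9, 1.1, 1.3]
avg, lower, upper = mean_d_with_ci(effect_sizes)
print(f"Mean d = {avg:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With only a handful of studies, a t-based critical value would give a wider and more cautious interval than the normal approximation assumed here.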

Why does it help to focus on effect sizes? e.g., effects of smoking research in the 1960s. Compare: “There is a statistically significant difference in mortality rates between smokers and non-smokers.” With: “Smoking up to half a pack a day (or less than 10 cigarettes) increases the chance of mortality by 40% when compared to non-smokers. Smoking two packs or more a day increases the risk of death by three times, to 120%, when compared to non-smokers.” (U.S. Department of Health, Education, and Welfare Report, 1967)

And what about small effects: can they be important too? r = .034, a truly ‘tiny’ effect! Regular aspirin consumption and decrease in heart attacks = 3.4% decrease = at least 3 out of 100 who would not have a heart attack if they regularly took aspirin. d = .30, a small magnitude effect! Effects of reading tutorials for underachieving students, the same for untrained peer tutoring and for highly trained teachers engaging in longer hours of tutoring. Both are important! Interpreting effect sizes: complex, contextualized, not absolute.

What are the definitional features of all syntheses (including all meta-analyses)? 1. Principled selection of studies. 2. Systematic coding of each study for main variables. 3. Direct use of the evidence reported (not the authors’ interpretations): Effect sizes, Confidence Intervals, Other kinds of new data based on old.

How do we do it? An example of Synthesis+meta-analysis

In applied linguistics, the first full-blown synthesis and meta-analysis: Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50,

Step 1: Problem Specification. Effects of instruction: Recasts; Garden path; Input enhancement; Input processing; Input flood; Inductive; Task-based interaction; Traditional grammar; Consciousness-raising; Dictogloss.

Focus of Norris & Ortega: L2 instruction and L2 learning. RQ 1 & 2: Instruction overall? By type? RQ 3: Effect of outcome measures? RQ 4: Instructional intensity? RQ 5: Durability of effects? RQ 6: Quality of research practices?

Step 2: Literature search. 1st → electronic searches. 2nd → other techniques: Manual searches of 14 journals; Footnote chasing of 25 reviews; Footnote chasing of each study included.

Step 3: Study eligibility criteria. Potentially relevant: 250 → relevant for synthesis: 77 → adequate for meta-analysis: 49

Step 4: Coding of study features Type of instruction: FonF, FonFS, explicit, implicit Type of outcome measure: metalinguistic, selected, constrained, free Intensity of instruction: Brief (less than 1 hr), short (between 1 and 2 hrs), medium (between 3 and 6 hrs), long (more than 7 hrs) Durability of effects: effect sizes on delayed tests

Steps 5 & 6: Analyze, display, interpret

Findings RQ 1 & 2 (effectiveness):

Findings RQ 3 (type of measure)

Findings RQ 4 (intensity):

Findings RQ 5 (durability):

RQ 1-5 (meta-analysis part): How effective is L2 instruction?
Clearly more effective than no instruction or only meaningful exposure to L2 → d = 0.96 based on 49 studies
Explicit instruction is superior in the short term to implicit instruction → d = 1.13 versus d = 0.54, based on 69 and 29 contrasts, respectively
But focus on form and on formS are equally effective → d = 1.00 form versus 0.93 formS, based on 43 and 55 contrasts, respectively
Effects are durable → delayed post-tests from 22 studies: d = 1.02

RQ 6 (synthesis part): Research practices
Too many variables in a single design → need to simplify designs, increase N
No pre-test (18%), no true control group (83%) → need to always include both
Poor reporting standards (52% no SD, 84% no instrument reliability, 57% no set alpha) → editors need to demand better reporting
Misuse of statistical inference (no assumptions checked or met, parametric stats on small samples, no consideration of magnitude) → the field needs better training in statistics if they insist on using such methods

Since then… accumulation of meta-analyses. In 2000, when Norris & Ortega was published, there were only 2 other published systematic syntheses in applied linguistics. As of 2010, Norris & Ortega identified 23 in their Timeline, most published since
Motivation: Masgoret & Gardner (2003)
Interaction: Keck et al. (2006), Mackey & Goo (2007)
Oral feedback: Russell & Spada (2006), Lyster & Saito (2010), Li (2010)
Use of glosses in CALL: Taylor (2006 & 2009), Abraham (2008)

Some challenges for research synthesis in L2 research…

Publication bias: “file drawer problem”. Well known phenomenon, present in all the social sciences (Rosenthal, 1979; Rothstein et al., 2005). Little understood in applied linguistics. Include fugitive literature. Check for publication bias.
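The slide does not say how to check for publication bias. One classical check associated with the file drawer problem is Rosenthal's fail-safe N: the number of unpublished null-result studies that would be needed to bring a combined result down to non-significance. The sketch below assumes each study contributes one standard normal deviate (Z); the Z values are hypothetical.

```python
from statistics import NormalDist


def fail_safe_n(z_scores, alpha=0.05):
    """Rosenthal's fail-safe N for a one-tailed test at the given alpha."""
    k = len(z_scores)
    if k == 0:
        raise ValueError("At least one study is needed")
    # One-tailed critical value, about 1.645 when alpha = .05.
    z_critical = NormalDist().inv_cdf(1 - alpha)
    return (sum(z_scores) ** 2) / (z_critical ** 2) - k


# Hypothetical Z values from five published studies.
z_scores = [2.1, 1.8, 2.5, 1.2, 2.9]
n_missing = fail_safe_n(z_scores)
print(f"Fail-safe N is about {n_missing:.0f} unpublished null studies")
```

Fail-safe N has known limitations, so it is usually reported together with other checks, such as funnel plots or comparing published with fugitive literature.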

Quality: “garbage in, garbage out”. The quality of a synthesis can only be as good as the quality of the primary studies that are synthesized in it... But how do we judge quality? Publication type? Methodology ratings? Exclusions?

Ethics: Anticipate consequences of synthesis. Would it prematurely close the area for research? Would it be taken as a personal attack on researchers/labs? What is the potential for findings to be (mis)appropriated by audiences (policy makers, teachers, …)?

High-tech statistication, cookie-cutter approach “... conceptual vacuum when technical meta-analytic expertise is not coupled with deep knowledge of the theoretical and conceptual issues at stake in the research domain under review…” (Norris & Ortega, 2006b, p. 37)

Meta-analysis only, no interest in quantitative synthesis of other kinds/scope. New-generation meta-analyses bypass synthesis: Li (2010); Lyster & Saito (2010); Plonsky (2011); Spada & Tomita (2010). Thomas (1994, 2006); Ortega (2003); ?????

Qualitative synthesis? No interest either in exploring qualitative synthesis… Only Téllez & Waxman (2006) in applied linguistics. Yet, much contemporary research in applied linguistics is qualitative and increasingly more is mixed-methods… both worth synthesizing!

And there are options to draw from in education, health sciences, and other fields! Meta-ethnography (Noblit & Hare, 1988; see Téllez & Waxman, 2006). Qualitative Comparative Analysis (Ragin, 1999). Critical Interpretive Synthesis (Dixon-Woods et al., 2006).

Value?

There is huge value in systematic synthesis (including meta-analysis): Secondary research, yes... but: Empirically accountable Conceptually illuminating: discovering new truths in old data

Sustained progress… Much improvement in certain reporting practices (LL, MLJ in particular). Larger N in primary studies = more trustworthy analyses. Use of increasingly sophisticated techniques in meta-analyses… → study quality criteria, weighting (by N, reliability, variance), fixed/random effects models, sensitivity analysis, trim & fill estimations, publication bias, etc. Use of meta-analytic software, e.g.:
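As a sketch of one technique from the list above (weighting by variance under a fixed-effect model), and not of any particular software package, the example below weights each study's d by the inverse of its sampling variance. The variance formula is a standard large-sample approximation from the meta-analysis literature, and the study values are hypothetical.

```python
import math


def d_variance(d, n_e, n_c):
    """Approximate sampling variance of d for two independent groups."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))


def fixed_effect_summary(studies):
    """Return (weighted mean d, lower, upper) with a 95% normal-theory interval."""
    # More precise studies (smaller variance) receive larger weights.
    weights = [1 / d_variance(d, n_e, n_c) for d, n_e, n_c in studies]
    total_weight = sum(weights)
    weighted_mean = (
        sum(w * d for w, (d, _, _) in zip(weights, studies)) / total_weight
    )
    standard_error = math.sqrt(1 / total_weight)
    margin = 1.96 * standard_error
    return weighted_mean, weighted_mean - margin, weighted_mean + margin


# Hypothetical (d, n experimental, n control) for three studies.
studies = [(0.5, 20, 20), (1.0, 50, 50), (0.8, 30, 30)]
summary_d, ci_lower, ci_upper = fixed_effect_summary(studies)
print(f"Fixed-effect d = {summary_d:.2f}, 95% CI [{ci_lower:.2f}, {ci_upper:.2f}]")
```

A random-effects model would add a between-study variance component to each study's variance before weighting, which typically widens the interval when studies disagree.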

“we envision synthetic methodologies as advancing our ability to produce new knowledge by carefully building upon, expanding, and transforming what has been accumulated over time... However,... all knowledge is bound by context and purpose...” (Norris & Ortega, 2006b, p. 37) But only if applied linguists cultivate “the will to synthesis”

Thank You

References
Abraham, L. B. (2008). Computer-mediated glosses in second language reading comprehension and vocabulary learning: A meta-analysis. Computer Assisted Language Learning, 21,
Dixon-Woods, M., Bonas, S., Booth, A., Jones, D. R., Miller, T., Sutton, A. J., et al. (2006). How can systematic reviews incorporate qualitative research? A critical perspective. Qualitative Research, 6,
Keck, C. M., Iberri-Shea, G., Tracy-Ventura, N., & Wa-Mbaleka, S. (2006). Investigating the empirical link between task-based interaction and acquisition: A meta-analysis. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp ). Amsterdam: John Benjamins.
Krashen, S., Long, M. H., & Scarcella, R. (1979). Accounting for child-adult differences in second language rate and attainment. TESOL Quarterly, 13,
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60,

Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA: A meta-analysis. Studies in Second Language Acquisition, 32 (2).
Mackey, A., & Goo, J. M. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp ). New York: Oxford University Press.
Masgoret, A.-M., & Gardner, R. C. (2003). Attitudes, motivation, and second language learning: A meta-analysis of studies conducted by Gardner and associates. Language Learning, 53,
Noblit, G. W., & Hare, R. D. (1988). Meta-ethnography: Synthesizing qualitative studies. Newbury Park, CA: Sage.
Norris, J. M. (2012). Meta-analysis. In C. Chapelle (Ed.), Encyclopedia of applied linguistics. Malden, MA: Wiley.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50,

Norris, J. M., & Ortega, L. (Eds.). (2006a). Synthesizing research on language learning and teaching. Amsterdam: John Benjamins.
Norris, J. M., & Ortega, L. (2006b). The value and practice of research synthesis for language learning and teaching. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 3-50). Amsterdam: John Benjamins.
Norris, J. M., & Ortega, L. (2007). The future of research synthesis in applied linguistics: Beyond art or science. TESOL Quarterly, 41,
Norris, J. M., & Ortega, L. (2010). Research timeline: Research synthesis. Language Teaching, 43,
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24,
Ortega, L. (2010). Research synthesis. In B. Paltridge & A. Phakiti (Eds.), Companion to research methods in applied linguistics (pp ). London: Continuum.

Plonsky, L. (2011). The effectiveness of second language strategy instruction: A meta-analysis. Language Learning, 61 (4).
Ragin, C. C. (1999). Using Qualitative Comparative Analysis to study causal complexity. Health Services Research, 34 (5, Part 2),
Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for the acquisition of L2 grammar: A meta-analysis of the research. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp ). Amsterdam: John Benjamins.
Spada, N., & Tomita, Y. (2010). Interactions between type of instruction and type of language feature: A meta-analysis. Language Learning, 60,
Taylor, A. M. (2006). The effects of CALL versus traditional L1 glosses on L2 reading comprehension. CALICO Journal, 23,
Taylor, A. M. (2009). CALL-based versus paper-based glosses: Is there a difference in reading comprehension? CALICO Journal, 27,

Téllez, K., & Waxman, H. C. (2006). A meta-synthesis of qualitative research on effective teaching practices for English Language Learners. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp ). Amsterdam: John Benjamins.
Thomas, M. (1994). Assessment of L2 proficiency in second language acquisition research. Language Learning, 44,
Thomas, M. (2006). Research synthesis and historiography: The case of assessment of second language proficiency. In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (pp ). Amsterdam: John Benjamins.