ICT 1 Generalization from empirical studies. Tore Dybå: Session introduction (~20 min.); Erik Arisholm: Generalizing results through a series of replicated experiments on software maintainability (~20 min.); Jeff Carver: Methods and tools for supporting generalization (~20 min.); Mini-group discussions (~10 min.); Plenary discussion (~20 min.). ISERN Meeting, Noosa Heads, Queensland, Australia, 14–15 November 2005
ICT 2 Generalization from Empirical Studies in SE: Session Introduction Tore Dybå SINTEF ICT email@example.com
ICT 3 (Some of) the problem Empirical SE research often generalizes about software organizations as if they were all alike, or refrains from generalizing at all, as if they were all unique: In the first case, it is never really clear that findings about organizations actually sampled apply to organizations not sampled. With respect to the second, is there really any point in studying software organizations if one does not believe that common denominators exist among relatively large classes of organizations? We must become more concerned about the conditions under which our research findings are valid if our work is to be applied more widely.
ICT 4 Generalization is closely related to construct validity and external validity Construct validity: the degree to which inferences are warranted from the observed persons, settings, and cause and effect operations included in a study to the constructs that these instances might represent.* External validity: the validity of inferences about whether the causal relationship holds over variations in persons, settings, treatment variables, and measurement variables.* *W.R. Shadish, T.D. Cook, and D.T. Campbell (2002) Experimental and Quasi-Experimental Designs for Generalized Causal Inference, Houghton Mifflin Company.
ICT 5 Statistical, sampling-based generalization The statistician’s traditional two-step ideal is often advocated as the gold standard for empirical studies: (1) random selection of units to enhance generalization, and (2) random assignment of those units to different treatments to promote causal inference. However, this model is of limited utility for generalized causal inference in empirical SE because: it assumes that random selection and its goals do not conflict with random assignment and its goals; it is rarely relevant for making generalizations about systems, tasks, settings, treatments, and outcome variables; and ethical, political, logistical, and economic constraints often limit random selection to less meaningful populations.
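The two-step ideal on this slide can be made concrete with a minimal sketch. Everything named here is an assumption for illustration, not part of the talk: the function `two_step_design`, the population of 100 hypothetical developers, and the two treatment labels. The sketch only shows the mechanics of the two steps; it does not address the objections the slide raises.

```python
import random


def two_step_design(population, sample_size, treatments, seed=0):
    """Illustrative only: random selection followed by random assignment."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Step 1: random selection of units from the population
    # (the step meant to support generalization to that population).
    sample = rng.sample(population, sample_size)

    # Step 2: random assignment of the selected units to treatments
    # (the step meant to support causal inference).
    shuffled = sample[:]
    rng.shuffle(shuffled)
    assignment = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        assignment[treatments[i % len(treatments)]].append(unit)
    return assignment


# Hypothetical population of developers; the names are placeholders.
population = [f"developer_{i}" for i in range(100)]
groups = two_step_design(population, 20, ["control", "treatment"])
print({name: len(units) for name, units in groups.items()})
# → {'control': 10, 'treatment': 10}
```

Note that the sketch samples persons only. The slide's point is that systems, tasks, settings, treatments, and outcome variables are rarely drawn this way, so the first step seldom supports generalization over them.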
ICT 6 The “painful” problem of induction Hume’s truism: In past experience, all tests have confirmed Theory 1. Therefore, the next test will confirm Theory 1 or all tests will confirm Theory 1. “… induction or generalization is never fully justified logically. Whereas the problems of internal validity are solvable within the limits of the logic of probability of statistics, the problems of external validity are not logically solvable in any neat, conclusive way. Generalization always turns out to involve extrapolation into a realm not represented in one’s sample. Such extrapolation is made by assuming one knows the relevant laws.”* *D.T. Campbell and J.C. Stanley (1963) Experimental and Quasi-Experimental Designs for Research, Houghton Mifflin Company, p. 17.
ICT 7 Yin’s conception of generalization* [Figure: diagram of two levels of inference. Labels: theory, rival theory; population characteristics, sample; case study findings; experimental findings, subjects; Level-1 inference (statistical); Level-2 inference (analytical).] *R.K. Yin (2003) Case Study Research: Design and Methods, Third Edition, Sage Publications.
ICT 8 Lee and Baskerville’s framework* The framework crosses what one generalizes from (empirical or theoretical statements) with what one generalizes to (empirical or theoretical statements). EE (from empirical, to empirical): generalizing from data to description. ET (from empirical, to theoretical): generalizing from description to theory. TE (from theoretical, to empirical): generalizing from theory to description. TT (from theoretical, to theoretical): generalizing from concepts to theory. *A.S. Lee and R.L. Baskerville (2003) Generalizing Generalizability in Information Systems Research, Information Systems Research, 14(3):221-243.
ICT 9 Shadish, Cook, and Campbell* Five principles of generalized causal inference: (1) Surface similarity: judging the apparent similarities between what was studied and the targets of generalization. (2) Ruling out irrelevancies: identifying those attributes of persons, settings, treatments, and outcome measures that are irrelevant because they do not change a generalization. (3) Making discriminations: making discriminations that limit generalization (e.g., from the lab to the field). (4) Interpolation and extrapolation: interpolating to unsampled values within the range of the sampled persons, settings, treatments, and outcomes, and extrapolating beyond the sampled range. (5) Causal explanation: developing and testing explanatory theories about the target of generalization. *W.R. Shadish, T.D. Cook, and D.T. Campbell (2002) Experimental and Quasi-Experimental Designs for Generalized Causal Inference, Houghton Mifflin Company.
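The distinction between interpolation and extrapolation in the fourth principle can be illustrated with a small sketch. The function `classify_target` and the team sizes are hypothetical and chosen only for illustration; a real study would make this judgment over several attributes at once, and not only numeric ones.

```python
def classify_target(sampled_values, target):
    """Illustrative only: locate a generalization target relative to the sampled range."""
    lowest, highest = min(sampled_values), max(sampled_values)
    if target in sampled_values:
        return "sampled"        # the value was actually studied
    if lowest <= target <= highest:
        return "interpolation"  # unsampled, but inside the sampled range
    return "extrapolation"      # outside the sampled range


# Hypothetical example: team sizes covered by a series of studies.
sampled_team_sizes = [3, 5, 12]
print(classify_target(sampled_team_sizes, 8))   # → interpolation
print(classify_target(sampled_team_sizes, 40))  # → extrapolation
```

Under this reading, a claim about teams of 8 rests on studied values on both sides, while a claim about teams of 40 rests on assumptions about how the effect behaves beyond anything sampled.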
ICT 10 Summary Formal sampling-based methods are of limited use for generalizing from empirical SE studies, especially for tasks, settings, treatments, and outcome measures. Additionally, there is a dilemma between scientific validity (complying with Hume’s truism) and practical impact (applying a theory in a new organizational setting). Although we should advocate the two-step model of random sampling followed by random assignment when it is feasible, we cannot advocate it as the model for generalized causal inference in SE. SE researchers must therefore use other concepts and methods to explore generalization from empirical SE studies. In fact, most SE researchers routinely make such generalizations without using formal sampling theory. In the rest of this session we will attempt to make explicit the concepts and methods used in such work. We turn now to examples of such alternative methods …
ICT 11 Mini-group and plenary discussions Form mini-groups with three persons – without leaving your chairs (first three, next three, etc.) Discuss the following two questions in the mini-groups for ~10 minutes: How do you generalize the results from YOUR studies? How can you improve the validity of these generalizations? Plenary discussion based on viewpoints from the groups