INFO4990 Information Technology Research Methods (July, 2004)


1 Experimentation
INFO4990 – Week 6
Monday, August 30, 2004 INFO4990 Information Technology Research Methods (July, 2004)

2 Agenda
- Experimentation in computer science and information systems research
- Basic experimentation concepts
- Some widely used experimental designs in the CS and IS fields
- Analyzing data from experimental studies

3 History
- Experiments in natural science: systematic acquisition of new knowledge, testing theories about nature
  - Agriculture
  - Chemistry
- Experimentation in social, psychological and economic studies
  - Study people's behavior
  - E.g., fairness studies

4 Experiment in computer science research
- Derived from natural science experimentation
- Computer systems performance analysis
  - Hardware
  - Software
  - Algorithm
  - Network

5 Experimentation in Information System research
- Derived from social and economic experimentation
- The subject under study is usually human
- Human behavior with regard to information systems
  - Trust transferred through hyperlinks
  - Which subject is most suitable for distance learning
- Helps in understanding the social context or role of an information system, and in designing interfaces

6 Purpose of experiment
- Discover and confirm causal relationships
- Examine the possible influence that one factor or condition may have on another factor or condition

7 Basic experimentation concepts
- Independent variable
  - The cause
  - The researcher “measures” (manipulates) the independent variable by creating a condition or situation
  - Manipulating the independent variable creates different treatments
- Event manipulation
  - Affecting the independent variable by altering the events that subjects experience
  - Presence versus absence
- Instructional manipulation
  - Varying the independent variable by giving different sets of instructions to the subjects

8 Basic experimentation concepts (cont)
- Effect (outcome)
  - Physical conditions, behaviors, attitudes, feelings, or beliefs of subjects that change in response to a treatment
- How to measure
  - IS research: various data collection methods
    - Questionnaires, interviews, observation, tests
  - CS research: metrics in the field
    - Performance time, rate, error rate, time to failure and duration

9 The importance of control
Internal validity: the extent to which we can accurately state that the independent variable produced the observed effect

10 Experiment cases
A marketing researcher wants to study how humor in television commercials affects sales. To do so, the researcher studies the effectiveness of two commercials that have been developed for a new soft drink called Zowie. One commercial, in which a well-known but serious television actor describes how Zowie has a zingy and refreshing taste, airs during the months of March, April, and May. The other commercial, a humorous scenario in which several teenagers throw Zowie at one another on a hot summer day, airs during the months of June, July, and August. The researcher finds that in June through August, Zowie sales are almost double what they were in the preceding three months. “Humor boosts sales,” the researcher concludes.
Many alternative explanations are possible, e.g., soft drinks tend to sell better in the summer months regardless of the commercial.

11 Strategies to achieve control
- Keep some things constant
  - What are variables that need to be held constant in most experiments?
- Include a control group
  - Treatment group (experimental group)
  - Between-subjects design
- Randomly assign people to groups
- Use matched pairs
  - Matched-subject design

12 Between and matched-subjects design
[Figure: in the between-subjects design, subjects 1 to 10 are randomly assigned to a treatment group and a control group, and the dependent variable (DV) is measured for both; in the matched-subjects design, one member of each matched pair is randomly assigned to each group.]
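The two assignment schemes in the figure can be sketched in a few lines of Python. This is only an illustration: the pairing of subjects (1 with 2, 3 with 4, and so on) and the fixed random seed are assumptions made for the example and are not taken from the slide.

```python
import random


def between_subjects(subjects, rng):
    """Randomly split the subjects into a treatment and a control group."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def matched_subjects(pairs, rng):
    """For each matched pair, randomly send one member to each group."""
    treatment, control = [], []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        treatment.append(first)
        control.append(second)
    return treatment, control


rng = random.Random(2004)  # fixed seed only so that the sketch is repeatable
subjects = list(range(1, 11))  # subjects numbered 1..10

treatment, control = between_subjects(subjects, rng)
print("between-subjects:", treatment, control)

# Hypothetical matching: subjects paired on some characteristic measured beforehand.
pairs = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
matched_treatment, matched_control = matched_subjects(pairs, rng)
print("matched-subjects:", matched_treatment, matched_control)
```

In both schemes chance alone decides who receives the treatment. Matching additionally guarantees that the two groups are balanced on the matching characteristic.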

13 Steps in conducting an experiment
- Identify the relevant variables
- State hypotheses
- Decide on an experimental design
- Decide how to manipulate the independent variables
- Develop a valid and reliable measure for the dependent variable
- Pilot test the treatment and the dependent variable measures
- Recruit subjects (or locate cases)
- Assign subjects to groups
- Introduce the treatment to the treatment groups
- Gather data for the measures of the dependent variables
- Test the hypotheses

14 Experimental design
- One-shot case study
- True experimental design
- Factorial design
- Block design

15 Classic true experimental design
- Pretest-posttest
- Treatment versus control group
- Randomized
- In the experimental design diagram, vertical alignment shows that the two pretests are measured at the same time

16 Factorial design
- Two or more independent variables are manipulated in a single experiment
- They are referred to as factors
- The major purpose of the research is to explore their effects jointly
- Factorial designs produce efficient experiments: each observation supplies information about all of the factors

17 A simple example
- Investigate an education program with a variety of variations to find the best combination
  - Amount of time receiving instruction: 1 hour per week vs. 4 hours per week
  - Setting: in-class vs. pull-out
- 2 x 2 factorial design
  - The number of numbers tells how many factors there are
  - The number values tell how many levels each factor has
  - The result of multiplying tells how many treatment groups the factorial design has
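The notation rule can be checked by enumerating the treatment groups directly. The sketch below builds the 2 x 2 education example; the dictionary keys and the helper name `treatment_groups` are made up for illustration.

```python
from itertools import product

# Factors and their levels, taken from the education program example.
factors = {
    "instruction_time": ["1 hour/week", "4 hours/week"],
    "setting": ["in-class", "pull-out"],
}


def treatment_groups(factors):
    """Return every combination of factor levels (one dict per treatment group)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


groups = treatment_groups(factors)
notation = " x ".join(str(len(levels)) for levels in factors.values())

print(f"{notation} factorial design: {len(groups)} treatment groups")
for group in groups:
    print(group)
```

Adding a third factor with three levels to `factors` would turn this into a 2 x 2 x 3 design with 12 treatment groups.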

18 Factorial designs in computer system performance analysis
- Personal workstation design
  - Processor: 68000, Z80, 8086
  - Memory size: 512K, 2M or 8M bytes
  - Number of disks: one, two or three
  - Workload: secretarial, managerial or scientific
  - User education: high school, college, post-graduate level
- Dependent variable
  - Throughput, response time

19 2^2 factorial design
- Two factors, each at two levels
- Example: workstation design
  - Factor 1: memory size
  - Factor 2: cache size
  - DV: performance in MIPS
Cache size   Memory 4 Mbytes   Memory 8 Mbytes
1K           15                45
2K           25                75
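One common way to analyze such a design is the sign table method described in Jain's book (listed in the references); the slide itself does not say which method was used, so treat the following as a sketch under that assumption. Each factor is coded -1 at its low level and +1 at its high level, and the model y = q0 + qA*xA + qB*xB + qAB*xA*xB is fitted to the four observations.

```python
# Performance in MIPS keyed by (memory level, cache level).
# Coding (an assumption of this sketch): -1 = 4 Mbytes / 1K, +1 = 8 Mbytes / 2K.
y = {(-1, -1): 15, (+1, -1): 45, (-1, +1): 25, (+1, +1): 75}

n = len(y)
q0 = sum(y.values()) / n                                # mean performance
q_mem = sum(a * v for (a, b), v in y.items()) / n       # memory effect
q_cache = sum(b * v for (a, b), v in y.items()) / n     # cache effect
q_inter = sum(a * b * v for (a, b), v in y.items()) / n  # interaction

print(q0, q_mem, q_cache, q_inter)  # 40.0 20.0 10.0 5.0

# Share of the total variation explained by each term.
sst = n * (q_mem**2 + q_cache**2 + q_inter**2)
share = {
    "memory": 100 * n * q_mem**2 / sst,
    "cache": 100 * n * q_cache**2 / sst,
    "interaction": 100 * n * q_inter**2 / sst,
}
for name, pct in share.items():
    print(f"{name}: {pct:.1f}% of variation")
```

With these numbers memory size accounts for most of the variation, cache size for much less, and the interaction for only a few percent.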

20 2^k factorial design
- k factors, each at two levels
- 2^k experiments
- 2^3 design example: in designing a personal workstation, the three factors to be studied are cache size, memory size and number of processors
Factor                 Level -1    Level 1
Memory size            4 Mbytes    16 Mbytes
Cache size             1 Kbyte     2 Kbytes
Number of processors   1           2
Cache size (Kbytes)   4 Mbytes, 1 proc   4 Mbytes, 2 proc   16 Mbytes, 1 proc   16 Mbytes, 2 proc
1                     14                 46                 22                  58
2                     10                 50                 34                  86
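The same sign table idea extends to any 2^k design. The sketch below is a generic helper (the name `effects_2k` is hypothetical) applied to the 2^3 workstation data. It assumes the flattened results table reads, per cache size row, as 4 Mbytes with 1 and 2 processors followed by 16 Mbytes with 1 and 2 processors.

```python
from itertools import combinations, product
from math import prod


def effects_2k(names, responses):
    """Estimate the mean, main effects and interactions of a 2^k design.

    responses maps a tuple of -1/+1 levels (one per factor, in the order
    given by names) to the observed value for that run.
    """
    k = len(names)
    runs = list(product((-1, 1), repeat=k))
    n = len(runs)
    effects = {"mean": sum(responses[run] for run in runs) / n}
    for size in range(1, k + 1):
        for idx in combinations(range(k), size):
            label = "*".join(names[i] for i in idx)
            # Sign of a term in a run is the product of the factor levels involved.
            total = sum(prod(run[i] for i in idx) * responses[run] for run in runs)
            effects[label] = total / n
    return effects


names = ("memory", "cache", "processors")
# Level -1: 4 Mbytes, 1 Kbyte, 1 processor; level +1: 16 Mbytes, 2 Kbytes, 2 processors.
responses = {
    (-1, -1, -1): 14, (-1, -1, +1): 46,
    (+1, -1, -1): 22, (+1, -1, +1): 58,
    (-1, +1, -1): 10, (-1, +1, +1): 50,
    (+1, +1, -1): 34, (+1, +1, +1): 86,
}

effects = effects_2k(names, responses)
for label, value in effects.items():
    print(f"{label}: {value}")
```

Under this reading of the table, the number of processors has the largest main effect, followed by memory size and then cache size.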

21 Full and fractional factorial design
- Full factorial design
  - Study all combinations
  - Can find the effects of all factors
- Fractional (incomplete) factorial design
  - Leave some treatment groups empty
  - Less information
  - May not get all interactions
  - No problem if the interactions are negligible

22 Two-factor full factorial design
- Used where there are two factors that are carefully controlled
- Examples in computer system performance analysis
  - To compare several processors using several workloads
  - To determine two configuration parameters such as cache and memory size

23 Two-factor full factorial design (cont)
- Example: cache comparison
- Columns: workload, two caches, one cache, no cache
  - ASM: 54.0, 55.0, 106.0
  - TECO: 60.0, 123.0
  - SIEVE: 43.0, 120.0
  - DHRYSTONE: 49.0, 52.0, 111.0
  - SORT: 50.0, 108.0

24 Field and controlled laboratory experiment
- Field experiment
  - Experiments conducted in real-life or field settings
  - The researcher has less control over the experimental conditions
  - Greater external validity but lower internal validity
- Controlled laboratory experiment
  - Conducted under the controlled conditions of a laboratory
  - Greater internal validity but lower external validity
- Practical considerations
  - Planning and pilot testing
  - Instructions to subjects
  - Post-experiment interview

25 Example of field and controlled laboratory experiments
- Field experiment
  - The case in slide 10
- A controlled laboratory version
  - Ask two groups of subjects (students) to view tapes of two different ads (event manipulation)
  - Use a questionnaire to collect their intentions to buy the product
  - Compare the responses from the two groups

26 Analyzing data from between-subjects design
- Problem: You want to measure the acquisition of mathematical skills by distance learning and by traditional classroom learning. The study compares 20 students, ten taught in the classroom (CL) and ten taught by a distance learning program (DL). The final test scores were collected as the dependent variable.
       DL     CL
       94     90
       89     91
       76     83
       85     81
       88     74
       65     60
       70     69
       72     63
       68     62
       64
Mean   77.1   73.6

27 Why can’t we just compare the means?
- The difference between the means is the same in all three cases, yet they tell very different stories
- When we look at the differences between scores for two groups, we have to judge the difference between their means relative to the spread or variability of their scores

28 t-test
- Assesses whether the means of two groups are statistically different from each other
- Sample size is small
- An approximately normal distribution of the measure in the two groups is assumed

29 Perform t-test

30 Interpret result
- Set a significance level
- Degrees of freedom: N1 + N2 - 2
- Compare the t-value with the critical value from the t-distribution to see whether it is large enough to be significant
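As a worked sketch, the independent-samples t-test can be applied to the distance learning (DL) versus classroom (CL) scores from slide 26 using only the Python standard library. Two things here are assumptions rather than content from the slides. First, the transcript lists only nine CL scores, so the tenth is taken to be 63, the value that reproduces the stated CL mean of 73.6. Second, the pooled-variance form of the test is used because it matches the N1 + N2 - 2 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

dl = [94, 89, 76, 85, 88, 65, 70, 72, 68, 64]
# The last CL score (63) is an assumption: it is the value implied by the
# stated mean of 73.6, since the transcript shows only nine CL scores.
cl = [90, 91, 83, 81, 74, 60, 69, 63, 62, 63]

n1, n2 = len(dl), len(cl)
df = n1 + n2 - 2

# Pooled sample variance of the two groups.
pooled_var = ((n1 - 1) * variance(dl) + (n2 - 1) * variance(cl)) / df
t_value = (mean(dl) - mean(cl)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-tailed critical value for df = 18 at the 0.05 level, read from a t-table.
critical = 2.101
significant = abs(t_value) > critical

print(f"means: DL={mean(dl):.1f}, CL={mean(cl):.1f}")
print(f"t = {t_value:.2f}, df = {df}, significant at 0.05: {significant}")
```

Under these assumptions t is about 0.68, well below the critical value, so the 3.5-point difference between the group means is small relative to the spread of the scores.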

31 Analyzing data from matched-subjects design
- Problem: You want to compare the hit rates of two cache algorithms. The simulated cache algorithms are run on 5 benchmarks and the hit rates are recorded.
       Cache 1   Cache 2
       0.91      0.95
       0.67      0.65
       0.85      0.90
       0.73      0.80
       0.93      0.97
Mean   0.818     0.854

32 Suitable test: Paired t-test
- Calculation of the t-value
- Degrees of freedom: N - 1
        Cache 1   Cache 2   Difference (D)   D^2
B1      0.91      0.95      -0.04            0.0016
B2      0.67      0.65       0.02            0.0004
B3      0.85      0.90      -0.05            0.0025
B4      0.73      0.80      -0.07            0.0049
B5      0.93      0.97      -0.04            0.0016
Total                       -0.18            0.011
Avg                         -0.036
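The paired calculation can be reproduced with a short standard-library sketch. The formula used, mean difference divided by its standard error, is the usual paired t-statistic. The critical value is read from a t-table and is not part of the slide.

```python
from math import sqrt
from statistics import mean, stdev

cache1 = [0.91, 0.67, 0.85, 0.73, 0.93]
cache2 = [0.95, 0.65, 0.90, 0.80, 0.97]

# One difference per benchmark (B1..B5), Cache 1 minus Cache 2.
diffs = [a - b for a, b in zip(cache1, cache2)]
n = len(diffs)
df = n - 1

# Paired t-statistic: mean difference over its standard error.
t_value = mean(diffs) / (stdev(diffs) / sqrt(n))

# Two-tailed critical value for df = 4 at the 0.05 level, read from a t-table.
critical = 2.776
significant = abs(t_value) > critical

print(f"mean difference = {mean(diffs):.3f}")
print(f"t = {t_value:.2f}, df = {df}, significant at 0.05: {significant}")
```

With only five benchmarks, t is about -2.39, which falls short of the 0.05 critical value even though Cache 2 has the higher hit rate on four of the five benchmarks.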

33 Analyzing data from factorial design
- Problem: The memory-cache experiments were repeated three times each. The results are shown in the table (cell means in parentheses).
- What we want to find out
  - Which factor contributes most to the performance
  - What the joint effect of the two factors is
Cache size   Memory 4M         Memory 8M
1K           15, 18, 12 (15)   45, 48, 51 (48)
2K           25, 28, 19 (24)   75, 81 (77)

34 Suitable test: ANOVA
- Two-way ANOVA (analysis of variance)
- F-value: between-sample variation / within-sample variation
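A balanced two-way ANOVA on the replicated memory-cache data from slide 33 can be sketched with the standard library alone. One value is an assumption: the transcript shows only two observations (75, 81) for the 2K cache and 8M memory cell, so the third is taken to be 75, the value that reproduces the stated cell mean of 77. The critical F value is read from an F-table.

```python
from statistics import mean

# data[cache][memory] -> three replicated observations.
data = {
    "1K": {"4M": [15, 18, 12], "8M": [45, 48, 51]},
    # The second 75 is an assumption implied by the stated cell mean of 77.
    "2K": {"4M": [25, 28, 19], "8M": [75, 75, 81]},
}

rows = list(data)             # cache sizes
cols = list(data[rows[0]])    # memory sizes
a, b = len(rows), len(cols)
r = len(data[rows[0]][cols[0]])  # replications per cell

grand = mean(y for row in rows for col in cols for y in data[row][col])
row_mean = {row: mean(y for col in cols for y in data[row][col]) for row in rows}
col_mean = {col: mean(y for row in rows for y in data[row][col]) for col in cols}
cell_mean = {(row, col): mean(data[row][col]) for row in rows for col in cols}

# Sums of squares for each factor, the interaction and the error.
ss_cache = b * r * sum((row_mean[row] - grand) ** 2 for row in rows)
ss_memory = a * r * sum((col_mean[col] - grand) ** 2 for col in cols)
ss_inter = r * sum(
    (cell_mean[row, col] - row_mean[row] - col_mean[col] + grand) ** 2
    for row in rows for col in cols
)
ss_error = sum(
    (y - cell_mean[row, col]) ** 2
    for row in rows for col in cols for y in data[row][col]
)

# F = between-sample variation / within-sample variation.
ms_error = ss_error / (a * b * (r - 1))
f_cache = (ss_cache / (a - 1)) / ms_error
f_memory = (ss_memory / (b - 1)) / ms_error
f_inter = (ss_inter / ((a - 1) * (b - 1))) / ms_error

critical = 5.32  # F(1, 8) at the 0.05 level, read from an F-table
for name, f in [("memory", f_memory), ("cache", f_cache), ("interaction", f_inter)]:
    print(f"{name}: F = {f:.2f}, significant at 0.05: {f > critical}")
```

Under that assumption all three F-values exceed the critical value. Memory size contributes by far the most to performance, and the joint effect of the two factors is small but detectable.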

35 Statistical package
- Excel
- SPSS
- SAS

36 References
- Paul D. Leedy and Jeanne Ellis Ormrod, Practical Research: Planning and Design, 7th edition
- Robert B. Burns, Introduction to Research Methods, 4th edition
- Raj Jain, The Art of Computer Systems Performance Analysis

