By: Diane Marie Lee, Ma. Mercedes Rodrigo, Ryan Baker, Jessica Sugay


1 Exploring the Relationship Between Novice Programmer Confusion and Achievement
By: Diane Marie Lee, Ma. Mercedes Rodrigo, Ryan Baker, Jessica Sugay, Andrei Coronel

2 Affective States and Achievement
Recent studies have illustrated the relationships between affective states and achievement.
Negative affective states have a negative impact on students’ achievement (Craig et al., 2006; Rodrigo, 2009; Lagud, 2010):
Craig et al. (AutoTutor): boredom was negatively correlated with learning gains.
Rodrigo: boredom and confusion were associated with lower achievement in the programming course.
Lagud (Aplusix): low-achieving students showed the highest levels of boredom and confusion.

3 Confusion
Double-edged / dual nature (D’Mello, 2009)
Harmful
Helpful

4 Goal
Use a discovery-with-models approach to find the relationship between novice programmer confusion and achievement

5 Data Collection
149 students enrolled in CS21a – Introduction to Computing I
Four lab sessions
BlueJ IDE
BlueJ plug-in (Jadud and Henriksen, 2009), connected to a SQLite database

6 Data Collection
Compilation logs = all submissions made to the compiler
Compilation logs include:
Computer number
Timestamp
Code
Error message (if any)
And many more!

7 Data Collection
Total of 340 student-lab sessions
Total of 13,528 compilation logs collected
More than 13,000 compilation logs are too hard to label individually
Confusion can’t be seen in a single compilation, so we must look at groups of compilations

8 Data Labeling
Sorted the compilations by student and by Java class name
Grouped the compilations into clips
Clips = 8 compilations
Total: 2,386 clips
Raters were asked to label a sample of 664 clips
We wanted a representative sample, so we took 2 clips per student
However, some students were so proficient that they had fewer than 16 compilations (too few for 2 full clips) during the whole lab session
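The clipping step can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' code: the record fields (`student`, `class_name`, `timestamp`), the function name `make_clips`, and the `step` parameter are assumptions. The slides do not say whether clips overlap or what happens to leftover compilations; here leftovers are dropped and overlap is controlled by `step`.

```python
from collections import defaultdict


def make_clips(compilations, clip_size=8, step=None):
    """Group compilation records into fixed-size clips per (student, class).

    `compilations` is a list of dicts with the hypothetical keys
    "student", "class_name" and "timestamp". `step` is the offset between
    the starts of consecutive clips; by default it equals `clip_size`,
    which gives non-overlapping clips (an assumption).
    """
    step = step or clip_size
    groups = defaultdict(list)
    for record in compilations:
        groups[(record["student"], record["class_name"])].append(record)

    clips = []
    for key in sorted(groups):
        # Order each student's compilations of one class by time.
        events = sorted(groups[key], key=lambda r: r["timestamp"])
        # Only full clips are kept; a trailing partial clip is dropped.
        for start in range(0, len(events) - clip_size + 1, step):
            clips.append(events[start:start + clip_size])
    return clips
```

With this sketch, a student with fewer than 16 compilations of one class yields at most one non-overlapping clip, which is the sampling problem the slide mentions.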

9 Data Labeling
Used low-fidelity text replays
Text replays maintain good inter-rater reliability and are efficient in aiding coders to label student disengagement (Baker et al., 2006)
Labels: Confused, Not Confused, Bad Clip
Cohen’s Kappa between raters: 0.77
Patterns the coders treated as signs of confusion:
The same error appeared in the same general vicinity within the code for several consecutive compilations. The coders inferred that the student did not know what was causing the error or how to fix it.
An assortment of errors appeared in consecutive compilations and remained unresolved. The coders inferred that the student was experimenting with solutions, changing the actual error message but not addressing the real source of the error.
Code malformations that showed a poor understanding of Java constructs, e.g. “return outside method”. The coders inferred that the student did not grasp even the basics of program construction, despite the availability of written aids such as Java code samples and explanatory slides.
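For readers unfamiliar with the agreement statistic, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies. The sketch below shows the standard formula; the function name and the toy labels are illustrative and are not the study's data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example with made-up labels (C = Confused, N = Not Confused).
rater_1 = ["C", "C", "N", "N"]
rater_2 = ["C", "N", "N", "N"]
print(cohens_kappa(rater_1, rater_2))
```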

10

11 Data Labeling
Filter out “bad clips”
Remove clips where raters disagreed on the label
Left with 418 clips for model construction

12 Model Construction
Used RapidMiner version 5.1
Features were mined from the clips
Used J48 Decision Trees with 10-fold batch cross-validation at the student level
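Cross-validation "at the student level" means that all clips from one student fall in the same fold, so the model is never tested on a student it was trained on. The sketch below illustrates that idea in plain Python; it is not RapidMiner's operator, and the function name, the seed, and the round-robin assignment are assumptions.

```python
import random


def student_level_folds(student_ids, n_folds=10, seed=0):
    """Return a fold index for each clip, keeping each student in one fold.

    `student_ids[i]` is the student who produced clip i.
    """
    students = sorted(set(student_ids))
    rng = random.Random(seed)
    rng.shuffle(students)
    # Deal the shuffled students out to folds round-robin.
    fold_of = {s: i % n_folds for i, s in enumerate(students)}
    return [fold_of[s] for s in student_ids]


# Toy example: seven clips from four hypothetical students, two folds.
clip_students = ["a", "a", "b", "c", "c", "c", "d"]
print(student_level_folds(clip_students, n_folds=2))
```

Each fold index can then be used as the held-out test set in turn while the model is trained on the remaining folds.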

13 Model Construction
Feature set used (time- and error-related features):
Average time between compilations
Maximum time between compilations
Average time between compilations w/ errors
Maximum time between compilations w/ errors
Number of compilations w/ errors
Number of pairs of consecutive compilations ending w/ the same error
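The feature set above could be computed from one clip along these lines. This is a sketch under stated assumptions: a clip is taken to be a time-ordered list of `(timestamp_seconds, error_message_or_None)` pairs, and "time between compilations w/ errors" is interpreted as the gap between two consecutive compilations that both ended in an error. The slides do not define that feature precisely, and the names used here are hypothetical.

```python
def clip_features(clip):
    """Compute time- and error-related features for one clip.

    `clip` is a time-ordered list of (timestamp_seconds, error) pairs,
    where `error` is the error message or None for a clean compilation.
    """
    pairs = list(zip(clip, clip[1:]))
    gaps = [b[0] - a[0] for a, b in pairs]
    # Assumed reading: gaps where both compilations ended in an error.
    error_gaps = [b[0] - a[0] for a, b in pairs
                  if a[1] is not None and b[1] is not None]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "avg_time": mean(gaps),
        "max_time": max(gaps, default=0.0),
        "avg_time_errors": mean(error_gaps),
        "max_time_errors": max(error_gaps, default=0.0),
        "num_errors": sum(1 for _, err in clip if err is not None),
        "same_error_pairs": sum(1 for a, b in pairs
                                if a[1] is not None and a[1] == b[1]),
    }
```

A decision-tree learner such as J48 would then be trained on one such feature vector per labeled clip.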

14 Kappa: 0.86

15 Data Relabeling
Model was coded as a Java program
Had the program relabel all 2,386 clips
Generated three sets of confused/not-confused sequences
Correlated each student’s percentage of each sequence with their midterm exam score
We counted the number of occurrences of each state or sequence per student within each set. The total number of sequences per student varied. We therefore normalized the data by dividing the number of occurrences of each state or sequence per student by the total number of occurrences for that student.
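The counting and normalization described above can be sketched as follows. It assumes each student's clips are reduced to an ordered string of labels (`N` for not confused, `C` for confused) and that sequences are counted with a sliding window, so consecutive sequences overlap. Both points are assumptions, as is the function name.

```python
from collections import Counter


def sequence_proportions(labels, length):
    """Proportion of each label sequence of the given length for one student.

    `labels` is the student's ordered clip labels, e.g. "NNCCN".
    Returns an empty dict if the student has fewer than `length` clips.
    """
    grams = ["".join(labels[i:i + length])
             for i in range(len(labels) - length + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    # Normalize so that students with different clip counts are comparable.
    return {gram: count / total for gram, count in counts.items()}


# Toy example with made-up labels for one student.
print(sequence_proportions("NNCCN", 2))
print(sequence_proportions("NNCCN", 3))
```

Calling it with lengths 1, 2 and 3 would give single states, pairs and triples, which is one plausible reading of the "three sets" on the slide.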

16 Results
Relationship with midterm (r, with p-value in parentheses):
Not Confused-Not Confused: .064 (0.539)
Not Confused-Confused: .139 (0.180)
Confused-Not Confused: .144 (0.163)
Confused-Confused: -.229 (0.026)

17 Results
Relationship with midterm (r, with p-value in parentheses; N = Not Confused, C = Confused):
NNN: -.015 (.901)
NNC: .014 (.909)
NCN: .062 (.610)
NCC: -.046 (.704)
CNN: .233 (.05)
CNC: .163 (.174)
CCN: .052 (.665)
CCC: -.337 (.004)
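The r values in the results are correlation coefficients between each student's normalized sequence proportion and midterm score. Assuming they are Pearson correlations (the slides only say "R"), the coefficient can be computed as in this sketch; the function name and the data are made up for illustration, and p-values are not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-student values: proportion of CCC sequences and scores.
ccc_proportion = [0.10, 0.40, 0.25, 0.05]
midterm_score = [88, 60, 71, 93]
print(pearson_r(ccc_proportion, midterm_score))
```

A negative coefficient, as reported for Confused-Confused and CCC, means that students with a larger share of that sequence tended to score lower on the midterm.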

18 Conclusion
Prolonged confusion has a negative impact on student performance
Resolved confusion has a positive impact on student performance
A certain amount of confusion is needed for learning

19 On-going Work
Support the incorporation of tools for automatic detection of confusion in computer science learning environments
Redoing the sampling and clipping method

20 Thank you
Questions?
Confusion = thrashing = repeatedly getting errors
Area for future work = go deeper into the confusion literature

