Predictors of Programming Performance in the Successor Course. Marija Brkić Bakarić, Higher Teaching Assistant; Maja Matetić, Associate Professor


1 Predictors of Programming Performance in the Successor Course
Marija Brkić Bakarić, Higher Teaching Assistant, mbrkic@inf.uniri.hr
Maja Matetić, Associate Professor, majam@inf.uniri.hr
Department of Informatics, University of Rijeka
Radmile Matejčić 2, 51000 Rijeka, Croatia
http://www.inf.uniri.hr

2 Objectives
to increase the pass rate for 2nd year enrollment to 75%
to investigate the accuracy of a range of classification techniques in predicting students at risk of failing the course Programming 2, based on their success in the course Programming 1
–one of the most important tasks in the process of knowledge discovery is selecting the algorithm that performs best on a given problem
–the performance of different algorithms is highly dependent on the nature of the training data

3 Course info
Programming 2
mandatory course
summer semester of the 1st year of undergraduate study of Informatics
C++ procedural programming
LMS Moodle, supplemental instruction classes
89 students in our case study

Activity                Scores
Online quiz 1           16
Online quiz 2           20
5 labs                  5 x 10
Preparatory quiz 1      -
Preparatory quiz 2      -
Activity & homeworks    14

4 Motivation
the goal of higher education institutions is the creation of human capital
teachers are rated according to whether they meet students' needs and expectations
a huge amount of data is generated in the educational process in the new digital era
during the first year of their studies, students face a number of challenges:
–to establish themselves in the new community
–to adjust their learning strategies and styles

5 Datasets
Dataset                      # of students    # of instances
Programming 2 – dataset A    89               77
Programming 1 – dataset B    82               77

6 Visualization as a pre-processing tool

7 Phase 1: cleaning raw data
removing data and logs that refer to the activities of teachers, administrators and other users who do not attend the course
defining attributes/features which define instances
Are the learning habits transferred from course to course?
Can students pass the course based solely on the theoretical part of the course?
Do students who retake preparatory quizzes until the maximum score is achieved pass the course?
Are students who delay examinations better off?
Is there a sex difference in programming success?
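The first cleaning step above can be sketched as a simple filter over log rows. This is a minimal illustration only: the field names `user_id` and `role`, the role label `"student"`, and the helper `keep_student_rows` are assumptions, not the actual Moodle export schema or the authors' tooling.

```python
# Hypothetical log rows; "user_id" and "role" are assumed field names,
# not the real Moodle log schema.
def keep_student_rows(log_rows, enrolled_ids):
    """Keep only rows generated by students who attend the course."""
    return [
        row for row in log_rows
        if row["role"] == "student" and row["user_id"] in enrolled_ids
    ]

log_rows = [
    {"user_id": 1, "role": "student", "action": "quiz attempt"},
    {"user_id": 2, "role": "teacher", "action": "grade update"},
    {"user_id": 3, "role": "student", "action": "quiz attempt"},  # not enrolled
    {"user_id": 9, "role": "admin", "action": "course backup"},
]

# Only the enrolled student's row survives the filter.
cleaned = keep_student_rows(log_rows, enrolled_ids={1})
```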

8 Dataset A features (33)
Feature          Score   Max       Min       Mean      StDev
Sex              -       -         -         -         -
Search           0       36        0         3.506     6.03
Sort             0       35        0         6.351     7.434
Prep1Attempts    0       21        1         3.7       3.361
Prep1Days        0       3         1         1.6       0.694
Prep11           0       12        0         9.1       3.198
Prep1L           0       12        6         11.292    1.368
Prep1StDev       -       5.065     0         1.953     1.367
Prep2Attempts    0       14        1         3.39      2.639
Prep2Days        -       4         1         1.576     0.675
Prep21           0       22        0         12.339    6.418
Prep2L           0       22        0         16.087    8.203
Prep2StDev       0       12.05     0         5.32      3.266
1st quiz         16                0         9.445     3.775
2nd quiz         20                4         15.357    3.947
Time1            -       0:40:1    0:0:0     0:16:21   0:10:5
Time2            -       0:25:0    0:5:36    0:15:42   0:5:42
Time             -       0:57:42   0:16:50   0:35:8    0:14:16

9 Dataset A features (33) (cont.)
Feature          Score   Max   Min   Mean    StDev
1st lab          10            0     3.81    4.329
2nd lab          10            0     3.339   3.951
3rd lab          10            0     4       2.849
4th lab          10            0     4.571   3.051
5th lab          10            0     4.649   3.774
Quizzes          36      30    9     23      7.231
1st lab N        8       6     0     3.273   1.849
1st lab NN       6       0     0     0       -
1st lab NNN      4       0     0     0       0
2nd lab N        8       7     0     3.318   1.953
2nd lab NN       6       6     0     3.25    2.754
2nd lab NNN      4       0     0     0       -
3rd lab N        8       7     0     2.75    2.36
3rd lab NN       6       3     0     0.818   0.982
4th lab N        8       6     0     1.417   1.886

10 Joint features (19)
Feature          Bad     Good     Great
Prep1Attempts
Prep1Days
Prep11           0-2     2-3      3-5
Prep1L           0-2     2-3      3-5
Prep1StDev
Prep2Attempts
Prep2Days
Prep21           0-8     8-15     15-22
Prep2L           0-8     8-15     15-22
Prep2StDev
1st quiz         0-5     5-11     11-16
2nd quiz         0-7     7-13     13-20
Time1
Time2
Time
1st lab          0-4     4-7      7-10
2nd lab          0-4     4-7      7-10
5th lab*         0-13    13-24    24-36
Quizzes          0-12    12-24    24-36
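The Bad/Good/Great ranges amount to a discretization of raw scores. A minimal sketch of such binning follows. The slide does not say which bin owns a boundary value, so this sketch assumes a score equal to a boundary falls into the higher bin. The names `CUT_POINTS`, `LEVELS` and `to_level` are illustrative and not part of the study.

```python
from bisect import bisect_right

# Cut points read off the joint-features slide:
# (Bad/Good boundary, Good/Great boundary).
CUT_POINTS = {
    "1st quiz": (5, 11),
    "2nd quiz": (7, 13),
    "1st lab": (4, 7),
    "2nd lab": (4, 7),
    "Quizzes": (12, 24),
}
LEVELS = ("Bad", "Good", "Great")

def to_level(feature, score):
    """Map a raw score to Bad/Good/Great.

    Assumption: a score equal to a boundary goes to the higher bin.
    """
    return LEVELS[bisect_right(CUT_POINTS[feature], score)]
```

For example, under this assumption `to_level("1st quiz", 5)` yields `"Good"` and `to_level("1st lab", 3)` yields `"Bad"`.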

11 Classification task – pass or fail?
Correctly classified instances
Training set      Zero R    Naïve Bayes   J48 (unpruned)   kNN       Multilayer Perceptron
Dataset A - 33    67.53%    90.91%        77.92%           70.13%    88.31%
Dataset A - 19    67.53%    88.31%        81.82%           67.53%    83.12%
Dataset B         67.53%    75.32%        84.42%           67.53%    83.12%
Dataset A&B       73.38%    91.56%        79.22%           70.78%    85.71%
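Zero R always predicts the majority class, so its accuracy is simply the share of that class and serves as the baseline for the other classifiers. The sketch below reproduces the 67.53% figure for Dataset A, assuming 52 passing and 25 failing students (the row sums of the midterm classification confusion matrix). The function name `zero_r_accuracy` is illustrative.

```python
from collections import Counter

def zero_r_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    majority_count = counts.most_common(1)[0][1]
    return majority_count / len(labels)

# Assumed class distribution for Dataset A: 52 pass, 25 fail (77 instances).
labels = ["pass"] * 52 + ["fail"] * 25
baseline = zero_r_accuracy(labels)
print(f"{baseline:.2%}")  # 67.53%
```

Any classifier in the table is only informative to the extent that it beats this baseline.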

12 Classification task – pass or fail?

13 Classification task through assignment weeks (J48)
Accuracy   Features
67.53%     1st lab
85.71%     2nd lab
88.31%     1st lab, 2nd lab, Prep1Attempts, Prep1Days, Prep11, Prep1L, Prep1StDev
88.31%     1st lab, 2nd lab, Prep1Attempts, Prep1Days, Prep11, Prep1L, Prep1StDev, quiz1
88.31%     1st lab, 2nd lab, Prep1Attempts, Prep1Days, Prep11, Prep1L, Prep1StDev, quiz1, Prep2Attempts, Prep2Days, Prep21, Prep2L, Prep2StDev
84.42%     1st lab, 2nd lab, Prep1Attempts, Prep1Days, Prep11, Prep1L, Prep1StDev, quiz1, Prep2Attempts, Prep2Days, Prep21, Prep2L, Prep2StDev, quiz2
84.42%     1st lab, 2nd lab, Prep1Attempts, Prep1Days, Prep11, Prep1L, Prep1StDev, quiz1, Prep2Attempts, Prep2Days, Prep21, Prep2L, Prep2StDev, quiz2, 5th lab
84.42%     1st lab, 2nd lab, Prep1Attempts, Prep1Days, Prep11, Prep1L, Prep1StDev, quiz1, Prep2Attempts, Prep2Days, Prep21, Prep2L, Prep2StDev, quiz2, 5th lab, quizzes

14 Midterm classification 88.31%
             Classified as pass   Classified as fail
True pass    52                   0
True fail    9                    16

15 Midterm classification 88.31%
             Classified as pass   Classified as fail
True pass    52                   0
True fail    9                    16

Instances with missing values removed 95.56%
             Classified as pass   Classified as fail
True pass    27                   0
True fail    2                    16
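The reported accuracies follow directly from the two confusion matrices. The sketch below recomputes them and also derives the recall for the fail class, meaning the share of truly failing students that the model flags. That recall figure is not reported on the slides and is added here only as an illustration. The helper names `accuracy` and `recall` are assumptions.

```python
def accuracy(matrix):
    """matrix[true][predicted] -> fraction of correctly classified instances."""
    correct = sum(matrix[cls][cls] for cls in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

def recall(matrix, cls):
    """Share of true `cls` instances that were classified as `cls`."""
    row = matrix[cls]
    return row[cls] / sum(row.values())

# Counts copied from the two confusion matrices on the slide.
all_instances = {
    "pass": {"pass": 52, "fail": 0},
    "fail": {"pass": 9, "fail": 16},
}
complete_only = {
    "pass": {"pass": 27, "fail": 0},
    "fail": {"pass": 2, "fail": 16},
}

for name, m in (("all instances", all_instances),
                ("missing values removed", complete_only)):
    print(f"{name}: accuracy {accuracy(m):.2%}, "
          f"fail recall {recall(m, 'fail'):.2%}")
# all instances: accuracy 88.31%, fail recall 64.00%
# missing values removed: accuracy 95.56%, fail recall 88.89%
```

Since the stated objective is to find students at risk, the fail-class recall is arguably the more relevant number: on the full data, 9 of the 25 failing students are classified as passing.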

16 Decision tree

17 Feature assessments
Chi-square, One R-test, Info Gain, Gain Ratio
AVG rank in TOP10   Dataset B   Dataset A   Dataset AB   Dataset A + sex   Dataset A-33
1st lab             4.63        1.25        3.93         1.25              1.40
2nd lab             5.38        2.50        4.43         2.58              3.60
1st quiz            4.93        6.30        5.38         7.30              15.63
5th lab             8.43        5.43        8.70         9.40              19.90
sex                 -           -           -            7.63              18.00

18 Feature assessments
Dataset A-33
1st lab     Search       3rd lab NN   Sex           5th lab         3rd lab   Prep1Attempts
2nd lab     1st lab N    4th lab N    1st lab NNN   Prep2Attempts   Prep21    Prep2L
Sort        2nd lab NN   3rd lab N    2nd lab NNN   Prep2Days       4th lab   Prep2StDev
2nd quiz    Quizzes      1st quiz     Prep11        Prep1StDev      Time2
Time1       2nd lab N    1st lab NN   Prep1Days     Prep1L          Time

19 Answer 1
The learning habits are not transferred from course to course (only 60% of the students belong to the same cluster in dataset AB).

                 Dataset A                Dataset B
Feature          Cluster 0   Cluster 1    Cluster 0   Cluster 1
Prep1Attempts    4.21        1.21         8.8         1.5
Prep1Days        1.69        0.68         2.32        0.83
Prep1StDev       1.87        0.39         1.03        0.16
Time1            1199.53     703.85       791.52      729.07
Prep2Attempts    4.02        0.79         8.92        2.94
Prep2Days        1.77        0.5          2.24        0.94
Prep2StDev       5.62        0.24         5.21        1.58
Time2            920.19      415          722.91      689.56
Time             146.53310.5687.8422.13
Total            43 (56%)    34 (44%)     25 (32%)    52 (68%)
Fail             19%         50%          4%          29%
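As a consistency check, the per-cluster failure rates for Dataset A can be combined into an overall number of failing students. The sketch assumes the percentages on the slide are rounded, so the result is rounded to the nearest whole student. The variable names are illustrative.

```python
# (cluster size, fail rate) for Dataset A, read off the cluster table.
clusters_a = [(43, 0.19), (34, 0.50)]

failing = sum(size * rate for size, rate in clusters_a)  # about 25.17
total = sum(size for size, _ in clusters_a)

print(round(failing), total)  # 25 77
```

The result, about 25 failing students out of 77, agrees with the 25 true-fail instances in the midterm confusion matrix. It also shows that the less active cluster (Cluster 1) accounts for roughly two thirds of the failures despite being the smaller group.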

20 Answer 2
P1
Students can pass the course based solely on the theoretical part of the course (association rule extracted by the Apriori algorithm with a minimum support of 0.2 and a confidence of 96%: quiz1 + quiz2, or only quiz2).
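Support and confidence, the two measures behind the Apriori rule quoted above, can be computed by hand on a small example. The transactions below are hypothetical and are not the study's data. The item labels such as `"quiz2=Great"` and the helper names `support` and `confidence` are assumptions made for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# HYPOTHETICAL students, one transaction each (not the study's data).
transactions = (
    [frozenset({"quiz2=Great", "pass"})] * 4
    + [frozenset({"quiz2=Great", "fail"})] * 1
    + [frozenset({"quiz2=Good", "pass"})] * 2
    + [frozenset({"quiz2=Bad", "fail"})] * 3
)

rule_support = support(transactions, {"quiz2=Great", "pass"})         # 4 of 10
rule_confidence = confidence(transactions, {"quiz2=Great"}, {"pass"})  # 4 of 5

MIN_SUPPORT = 0.2  # minimum support quoted on the slide
print(rule_support >= MIN_SUPPORT, rule_support, rule_confidence)
```

In this toy data the rule "quiz2=Great implies pass" clears the 0.2 support threshold with a confidence of 0.8. The rule on the slide reaches 96% confidence on the real data.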

21 Answer 2 (cont.)
P2
Students can pass the course based solely on the theoretical part of the course.

22 Answer 3
There is no sex difference in programming success!
–sex as a feature slightly deteriorates performance on dataset B with the J48 classifier

23 Question 4
Do students who retake preparatory quizzes until the maximum score is achieved pass the course?

24 Question 5
Are students who delay examinations better off?

25 Conclusion
We can predict students at risk of failing the course based on the predecessor course with an accuracy of about 95%.
A passing threshold should be set separately for the theoretical and the practical part of the course.
The number of attempts at preparatory quizzes should be limited so that the results are more meaningful.
There is much more to discover with educational data mining (EDM)!

26 Thank You for your attention!

