Published by Leticia Hollier. Modified about 1 year ago.

1
Using data mining methods to identify college freshmen who need special assistance in their academic performance
Advisor: Professor 呂學毅
Student: 陳彥璋

2
Introduction | Review of literature | Method
Background
Diagram labels: colleges, instruction, grade
(Author's note: add a text description.)

3
Background (cont.)
Diagram labels: students, factors, grade
(Author's note: add a text description.)

4
Background (cont.)
Lower grade

5
Background (cont.)
College freshmen: have more adjustment problems than students in higher years.
Lower grades: affect physical and mental health, and affect grades over the following several semesters.

6
Background (cont.)
※ Taking Taiwan's National Yunlin University of Science and Technology as an example:
Since academic year 94, the university has been building the 「強化學生輔導新體制工作計畫」 (work plan for strengthening the new student guidance system).
It is also building an early-warning policy for students' academic achievement.
Diagram labels: guidance in college (learning; career planning; emotional and life problems); for low academic achievement.

7
Motivation (cont.)
The general practice: the list of students who need special assistance is drawn up after the final exam, once the test scores are out.
This study: the list of students who need special assistance is drawn up at the time the freshmen enter.

8
Motivation
Factors related to grade:
Family, intelligence, sex, …
Emotion, personality, …
Learning motivation, learning engagement, …

9
Objective
The aim of this study is to construct a model with data mining tools that predicts which college freshmen will have low academic achievement.
The model is meant to find students who need special assistance in their academic performance and to help them improve it through guidance as early as possible.

10
The negative effects of low academic achievement.

11
The problems of college freshmen with low academic achievement.
Author | Year | Finding
業邵國，何英奇，陳舜芬 | 2007 | Because the college environment is more complex than high school, freshmen entering the new environment encounter many adjustment problems.
潘正德 | 2007 | College freshmen with poor academic achievement may be affected for several subsequent semesters.
黃春枝 | 1999 | College freshmen entering the new environment encounter more adjustment problems than students in higher years.

12
The relationship between personality (emotion) and grade.
Author | Year | Finding
McIlroy & Bunting | 2002 | Good personality and behavior contribute to students' academic performance.
Busato, Prins, Elshout, Hamaker | 2000 | Students' intelligence, personality, motivation, and academic achievement are positively correlated.
Yeh et al. | 2007 | The academic achievement of students with anxiety or depression is affected.
Parker et al. | 2004 | Students' emotions and academic achievement are correlated.

13
Forecasting model construction process

14
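Coding
Data attributes of primary data
Attribute name | Attribute values | Attribute type | Number of values
Sex (性別) | Male; Female | Binary | 2
College (學院) | Engineering; Management; Design; Humanities and Science | Categorical | 4
Sociability (社交性) | 0 to 20 points | Numeric | -
Dominance (主導性) | 0 to 20 points | Numeric | -
Drive to act (行動力) | 0 to 20 points | Numeric | -
Thoughtfulness (思考性) | 0 to 20 points | Numeric | -
Activity (活動性) | 0 to 20 points | Numeric | -
Aggressiveness (攻擊性) | 0 to 20 points | Numeric | -
Fault-finding (挑剔性) | 0 to 20 points | Numeric | -
Objectivity (客觀性) | 0 to 20 points | Numeric | -
Neuroticism (神經質) | 0 to 20 points | Numeric | -
Sense of inferiority (自卑感) | 0 to 20 points | Numeric | -
Mood variability (情緒轉變性) | 0 to 20 points | Numeric | -
Depressiveness (憂鬱性) | 0 to 20 points | Numeric | -
Depression level (憂鬱程度) | Normal; Mild; Marked; Severe | Categorical | 4
List result (名單結果) | General student; High-concern student | Binary | 2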

15
Feature selection
Some attributes are noisy or redundant. This noise makes it more difficult to discover meaningful patterns in the data (Dash, 1997).
Sequential Backward Selection: uses Shannon's entropy as the criterion for finding the attributes that have more explanatory capability.
Sequential Backward Selection:
T = Original Variable Set
For k = 1 to M - 1 {    /* Iteratively remove variables one at a time */
    For every variable v in T {    /* Determine which variable to be removed */
        Tv = T - v
        Calculate E_Tv on D using eqn. 1
    }
    Let vk be the variable that minimizes E_Tv
    T = T - vk    /* Remove vk as the least important variable */
    Output vk
}
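The backward-selection loop can be sketched in Python as follows. This is only an illustration under stated assumptions: the slide does not reproduce eqn. 1, so `similarity_entropy` assumes a pairwise-similarity entropy (similarity decaying exponentially with Euclidean distance, equal to 0.5 at the mean distance). The function names and the toy data are hypothetical.

```python
import math


def similarity_entropy(rows, features):
    """Entropy of the data restricted to `features` (ASSUMED form of eqn. 1)."""
    # Pairwise Euclidean distances using only the selected features.
    dists = [
        math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))
        for i, a in enumerate(rows)
        for b in rows[i + 1:]
    ]
    mean_dist = sum(dists) / len(dists) if dists else 0.0
    if mean_dist == 0.0:
        return 0.0
    alpha = -math.log(0.5) / mean_dist  # similarity is 0.5 at the mean distance
    total = 0.0
    for d in dists:
        s = math.exp(-alpha * d)
        if 0.0 < s < 1.0:  # terms with s == 0 or s == 1 contribute nothing
            total -= s * math.log(s) + (1.0 - s) * math.log(1.0 - s)
    return total


def sequential_backward_selection(rows, features):
    """Remove one variable per round; return (removal order, remaining variables)."""
    remaining = list(features)
    removed = []
    for _ in range(len(features) - 1):
        # v_k is the variable whose removal leaves the lowest entropy.
        v_k = min(
            remaining,
            key=lambda v: similarity_entropy(rows, [f for f in remaining if f != v]),
        )
        remaining.remove(v_k)
        removed.append(v_k)
    return removed, remaining


# Hypothetical toy data: three numeric attributes already scaled to [0, 1].
rows = [
    [0.10, 0.90, 0.50],
    [0.15, 0.20, 0.52],
    [0.90, 0.85, 0.49],
    [0.95, 0.10, 0.51],
]
removed, remaining = sequential_backward_selection(rows, [0, 1, 2])
print("removed in order:", removed, "kept:", remaining)
```

Variables removed first are treated as least important, so the removal order read backwards gives a ranking of the attributes.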

16
Feature selection (cont.)

17
Data mining

18
Data mining: K-Fold Cross-validation
K-fold cross-validation is mainly used in settings where the goal is prediction, to estimate how accurately a predictive model will perform in practice.
One round of cross-validation involves partitioning a sample of data into complementary subsets, performing the analysis on one subset (called the training set), and validating the analysis on the other subset (called the testing set).
To reduce variability, multiple rounds of cross-validation are performed using different partitions, and the validation results are averaged over the rounds.
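The partition-train-validate-average cycle might look like the sketch below. It is not the study's code: `k_fold_splits`, `cross_validate`, the `fit` and `predict` callables, and the majority-class baseline are all hypothetical names chosen for illustration, and accuracy is assumed as the per-round score.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k rounds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(samples, labels, k, fit, predict):
    """Average accuracy over k rounds; `fit` and `predict` are supplied by the caller."""
    scores = []
    for train, test in k_fold_splits(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


# Hypothetical baseline: always predict the most common training label.
def fit_majority(train_samples, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, sample):
    return model


samples = list(range(10))
labels = ["general"] * 10
print(cross_validate(samples, labels, 5, fit_majority, predict_majority))
```

Each sample lands in the testing set exactly once, so every record contributes to the averaged estimate.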

19
Data mining: C4.5 Decision Trees
C4.5 is an extension of Quinlan's earlier ID3 algorithm (Quinlan, 1993).
The decision trees generated by C4.5 can be used for classification.
Diagram labels: internal node (attribute), branches, leaf node (class)
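The slide does not say how the attribute at an internal node is chosen. C4.5 is usually described as ranking candidate attributes by gain ratio (information gain divided by split information), and the sketch below computes that quantity for one categorical attribute. The function names and toy values are hypothetical, and continuous attributes and pruning are left out.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute's own value distribution
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one attribute separates the classes, the other does not.
outcome = ["general", "general", "high-concern", "high-concern"]
print(gain_ratio(["mild", "mild", "severe", "severe"], outcome))
print(gain_ratio(["male", "female", "male", "female"], outcome))
```

The attribute with the highest gain ratio would become the internal node, with one branch per attribute value.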

20
Data mining: C4.5 Decision Trees (cont.)

21
Data mining: Naïve Bayes Classifier

22
Data mining: MLP Artificial neural network
An ANN model is a class of functions whose members are obtained by varying parameters, connection weights, or specifics of the architecture such as the number of neurons or their connectivity.
Diagram labels: neurons, weight, adder, activation function, y
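The diagram's parts (weights, adder, activation function, output y) correspond to a single neuron, which can be sketched as below. The sigmoid activation and the bias term are assumptions, since the slide names neither, and `neuron` is a hypothetical function name.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Output y of one neuron: weighted sum (adder) followed by an activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias  # adder
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation (assumed)


# Hypothetical inputs and weights; the weighted sum here is exactly zero.
y = neuron([1.0, 2.0], [0.5, -0.25])
print(y)
```

Changing the weights changes which function the neuron computes, which is the sense in which varying parameters yields different members of the model class.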

23
Data mining: MLP Artificial neural network (cont.)
A multilayer perceptron (MLP) is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs.
An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one.
Diagram labels: input layer, hidden layer, output layer
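A forward pass through fully connected layers can be sketched as follows. The layer sizes, the sigmoid activation, and the weight values are illustrative assumptions, not the network used in the study, and training (for example by backpropagation) is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One fully connected layer: every output node sees every input."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node.
hidden = ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1])
output = ([[1.5, -1.0]], [0.2])
print(mlp_forward([0.5, 0.8, 0.1], [hidden, output]))
```

Each row of a weight matrix holds the incoming connection weights of one node, which is what makes consecutive layers fully connected.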

24
Data mining: MLP Artificial neural network (cont.)

25
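Model evaluation: Confusion Matrix
 | Predicted: True | Predicted: False
Actual: Positive | TP | FN
Actual: Negative | FP | TN
$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}$
$\text{Sensitivity (true positive rate)} = \frac{TP}{TP + FN}$
$\text{Specificity} = \frac{TN}{FP + TN}$
$\text{False positive rate} = \frac{FP}{FP + TN} = 1 - \text{Specificity}$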
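The four measures named on this slide can be computed from the confusion-matrix counts using their standard definitions, as in the sketch below. The function names and the example labels are hypothetical, with 1 standing for a student who needs special assistance.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FN, FP, TN) for binary labels."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def evaluation_metrics(tp, fn, fp, tn):
    """Standard measures; assumes both classes occur in the actual labels."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (fp + tn),
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
    }


# Hypothetical labels for eight students.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(actual, predicted)
print(counts, evaluation_metrics(*counts))
```

For a screening task like this one, sensitivity shows how many of the students who actually need help the model catches, which accuracy alone can hide when such students are a small minority.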

26
Model evaluation: Receiver Operating Characteristic

27
Expected result
This study is expected to construct a forecasting model from college freshmen's data.
The forecasting model will be built with data mining methods and selected from three classifiers (C4.5 decision trees, Naïve Bayes classifier, MLP artificial neural network).
The forecasting model can identify college freshmen who need special assistance in their academic performance.
Colleges can then use the model to help students improve their academic performance through guidance as early as possible.

28
Gantt Chart

29
Q & A
