Predicting the dropouts rate of online course using LSTM method

Name: Predicting the dropouts rate of online course using LSTM method
Uploaded: 2017-08-24T12:33:34+00:00
Duration: PT19M57S
Channel: Ginger Jacobs
Description: Predicting the dropouts rate of online course using LSTM method

Predicting the dropouts rate of online course using LSTM method
Group 20. Good afternoon, everyone… The research paper we have read is... You may wonder why on earth anyone would put so much effort into predicting online course attendance. I had the same doubt at first. However, after learning about the potential of online education, I am convinced there is a real market for it. Imagine that you can view the presentation slides online, listen to the professor's presentation, and pause or go back and forth as you need. Attendance can be taken online by filling in a worksheet, and you can interact with the professor through an online platform. This not only saves students' travelling time, it can also dramatically reduce education spending by employing fewer professors and cutting costs such as room rental.

New era of education? If you are still not convinced, let us look at the private education market in Hong Kong. The market is worth 21 billion HKD per year, and one of the largest tutorial schools has undergone a 100 million IPO. A star tutor at Beacon education, also one of the largest in the market, was offered an 80 million annual pay package by Modern education. The amount is tremendous and is even higher than Roger Federer's annual salary last year. From these star tutors we can see the huge demand in the online tutorial market: students prefer to learn from a video of a star tutor rather than from a real teacher at school. If education could move online, would it be a turning point for a new era?

Content: Introduction of MOOC, Setting the Scene, Recurrent Neural Network (RNN), LSTM method, Prediction Results

Introduction

What is MOOC? New era for the education industry? Do you prefer to learn at home? No attendance mark? What is the potential problem?
To begin with, MOOC is the word we need to remember; it stands for massive open online course. It simply means putting educational videos online in a well-organised order, with interaction between learners and teachers. With technical advances, we can take attendance, hold exams and hand in homework online. However, there are still problems related to discipline and attitude towards learning.

Set the scene

Research Paper Motivation & Methodologies:
Motivation:
- Dropout prediction, that is, identifying students at risk of dropping out of a course
- Identify students who are struggling and allow intervention before the course completes
- Possibly identify areas of the course that are more difficult, which could be used when structuring the next year's course
Methodologies:
- Vanilla RNN network
- LSTM model
The motivation of this paper is to let the website understand its users, the students, better from their behaviour on the site, such as sharing rate and watching time. With that, it can predict the dropout rate of a course, or identify students at risk of dropping out, which is useful for predicting how many students will finish the course. It can also identify students who are struggling, so that intervention is possible before the course completes. Finally, the analysis could identify the more difficult areas of the course, which could be used when structuring the next year's course.

The data set
Two courses are analyzed: “The Science of Gastronomy” and “Introduction to Java Programming”. A significant number of students never attended class.
The data set used to build the model comes from two online courses, “The Science of Gastronomy” and “Introduction to Java Programming”. Introduction to Java Programming had around … people registered, but fewer than … attended class at least once. For The Science of Gastronomy, around 85000 students took the course and around … attended at least one lecture. That is roughly 40%. Not a lot, right? The research looks only at this forty percent of the data and predicts the dropout rate of these people.

Variable selection
Tables: the variables chosen for “The Science of Gastronomy”; the variables chosen for “Introduction to Java Programming”.
These are the features chosen for the analysis. Some are common to both courses, such as the number of lecture videos a student viewed, the number of times a student interacted with the course forum, and the number of course problems a student worked on.

Definitions for dropout
DEFINITION 1 Participation in the final week: whether a student will stay to the end of the course
DEFINITION 2 Last week of engagement: whether the current week is the last week the student has activities
DEFINITION 3 Participation in the next week: whether the student has activity in the next week
So there are three definitions of dropping out of a course considered in this paper, first… The figure shows one data input for “The Science of Gastronomy”, which should help you understand the definitions. The numbers in the feature rows are the feature inputs, such as the number of videos a student viewed; whenever a week has any input in the feature rows, the student attended that week. In the label row, 1 means the student dropped out of the class and 0 means the student did not.
Figure: an example of one data input for the Gastronomy lecture.
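The three definitions can be made concrete with a small sketch. It assumes a student's record is reduced to one boolean per week (`active`, a hypothetical name) saying whether any activity was logged, and that 1 means dropout as in the example above. How the paper labels edge cases, such as weeks after the last activity under Definition 2 or the final week under Definition 3, is not stated in this talk, so those choices are assumptions.

```python
from typing import List


def label_final_week(active: List[bool]) -> List[int]:
    """Definition 1: every week gets 1 (dropout) if the final week has no activity."""
    label = 0 if active[-1] else 1
    return [label] * len(active)


def label_last_engagement(active: List[bool]) -> List[int]:
    """Definition 2: 1 only at the last week that has any activity.

    Assumption: all other weeks, including weeks after the last activity, get 0.
    """
    last = max((i for i, a in enumerate(active) if a), default=None)
    return [1 if i == last else 0 for i in range(len(active))]


def label_next_week(active: List[bool]) -> List[int]:
    """Definition 3: 1 at week t if there is no activity in week t + 1.

    Assumption: the final week has no "next week", so it gets no label.
    """
    return [0 if active[t + 1] else 1 for t in range(len(active) - 1)]


# Toy record: active in weeks 1, 2 and 4 of a six-week course.
weeks = [True, True, False, True, False, False]
print(label_final_week(weeks))
print(label_last_engagement(weeks))
print(label_next_week(weeks))
```

The same activity record yields three different label sequences, which is why the results are reported per definition.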

Formulation of the prediction system
With the variables chosen, we then move on to the analysis procedure. For a specific student, we obtain her week-by-week activities in the course, denoted by an input sequence $(x_1, \ldots, x_t)$, and the corresponding dropout labels forming an output sequence $(y_1, \ldots, y_t)$. As the course progresses, our goal is to predict the dropout label of a student week by week by using the weekly activities of this student up to the current week; that is, the model trained and the prediction made at week $k$ only use features up to $x_k$. We view the dropout prediction task as a sequence labeling or classification problem [14], [3], since the activities recorded week by week form an input sequence and the corresponding dropout labels assigned according to the chosen dropout definition form an output sequence. While a conventional classification problem assumes that the inputs are independent and identically distributed (i.i.d.), the inputs forming the input sequence in a sequence classification task are actually dependent. For example, if a student participates actively in the early weeks of a course, it is reasonable to assume that she will continue to participate actively in the following weeks. Conversely, if a student never shows up in the early weeks, she is unlikely to participate actively, or even show up at all, in the following weeks.
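As a rough illustration of this week-by-week setup, the sketch below pairs each week k with only the features observed up to and including that week. The function name and the toy feature vectors are made up for illustration; the paper's actual feature pipeline is not shown in this talk.

```python
from typing import List, Sequence, Tuple

Features = List[float]


def weekly_prediction_inputs(
    xs: Sequence[Features], ys: Sequence[int]
) -> List[Tuple[List[Features], int]]:
    """For each week k, return (features x_1..x_k, label y_k).

    The prediction at week k never sees features from later weeks.
    """
    if len(xs) != len(ys):
        raise ValueError("need one label per week of features")
    return [(list(xs[: k + 1]), ys[k]) for k in range(len(ys))]


# Hypothetical weekly features: [videos viewed, forum interactions].
xs = [[3.0, 0.0], [1.0, 2.0], [0.0, 0.0]]
ys = [0, 0, 1]
for prefix, label in weekly_prediction_inputs(xs, ys):
    print(len(prefix), "week(s) of history -> label", label)
```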

Recurrent Neural Network

Introduction to Neural Network
(1:27) - (4:29)

Recurrent Neural Network (RNN)
Humans don’t start their thinking from scratch every second. When you read an essay, you understand each word based on your understanding of previous words. You don’t throw everything away and start thinking from scratch again. Your thoughts have persistence. Traditional neural networks can’t do this, and it seems like a major shortcoming. For example, imagine you want to classify what kind of event is happening at every point in a movie. It’s unclear how a traditional neural network could use its reasoning about previous events in the film to inform later ones. Recurrent neural networks address this issue. They are networks with loops in them, allowing information to persist. In the diagram, a chunk of neural network, $A$, looks at some input $x_t$ and outputs a value $h_t$. A loop allows information to be passed from one step of the network to the next. A recurrent neural network can be thought of as multiple copies of the same network, each passing a message to a successor. This chain-like nature reveals that recurrent neural networks are intimately related to sequences and lists.
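The loop can be sketched in a few lines. This is a minimal illustration, assuming the common vanilla RNN update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h); the weight names and toy values are hypothetical and are not taken from the paper.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(x_t: Vector, h_prev: Vector, W_xh: Matrix, W_hh: Matrix, b_h: Vector) -> Vector:
    """One step of the loop: the new state mixes the input with the previous state."""
    from_input = matvec(W_xh, x_t)
    from_state = matvec(W_hh, h_prev)
    return [math.tanh(a + r + b) for a, r, b in zip(from_input, from_state, b_h)]


def rnn_forward(xs: List[Vector], W_xh: Matrix, W_hh: Matrix, b_h: Vector) -> List[Vector]:
    """Apply the same step (the same weights) to every element of the sequence."""
    h = [0.0] * len(b_h)  # assumed initial state: all zeros
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        states.append(h)
    return states


W_xh = [[0.5, -0.5], [0.1, 0.2]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]
print(rnn_forward([[1.0, 0.0], [0.0, 1.0]], W_xh, W_hh, b_h))
```

Because `h` is fed back in at every step, the state after the second input depends on the first input as well, which is the persistence described above.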

Example for using RNN (language model)
For example, we would like to guess the last word of the sentence “The clouds are in the ____”. We try to predict the next word based on the previous ones. If we are trying to predict the last word in “the clouds are in the sky,” we don’t need any further context; it’s pretty obvious the next word is going to be sky. The RNN can handle this with its short-term memory. Answer: SKY

Example of using RNN: “I grew up in France… I speak fluent ____.” Language? French? English?
This is another example, and a slightly more difficult question. The system needs to go through a much longer data retrieval process. Knowing that “speak” is related to a language is not enough; it also needs to know that growing up in France makes fluency in French highly probable. This may take much longer to process. For more complicated versions, it may require a very long time to produce a quality prediction, or it may lead to errors in prediction. In theory, RNNs are absolutely capable of handling such “long-term dependencies.” A human could carefully pick parameters for them to solve toy problems of this form. Sadly, in practice, RNNs don’t seem to be able to learn them. In our case, the researchers found that the results from the vanilla RNN, a conventional method for attendance prediction, were not accurate enough, and hence they applied a new method for prediction: the LSTM method.
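One common explanation for this difficulty is the vanishing gradient problem, which the LSTM summary later in this talk mentions. The toy sketch below assumes a one-unit RNN h_t = tanh(w h_{t-1} + x_t) with a hypothetical recurrent weight `w`, and measures how much the final state depends on the initial state as the sequence gets longer. It is an illustration only, not an experiment from the paper.

```python
import math
from typing import List


def gradient_through_time(w: float, inputs: List[float]) -> float:
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t), h_0 = 0.

    By the chain rule this is the product over steps of w * (1 - h_t ** 2).
    """
    h = 0.0
    grad = 1.0
    for x_t in inputs:
        h = math.tanh(w * h + x_t)
        grad *= w * (1.0 - h * h)  # derivative of this step w.r.t. the previous state
    return grad


short = gradient_through_time(0.9, [0.5] * 5)
long = gradient_through_time(0.9, [0.5] * 50)
print(short, long)
```

Every factor has magnitude at most |w|, so with |w| < 1 the influence of early inputs shrinks at least geometrically with the distance, which makes a gap like the one between “France” and “French” hard to learn.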

Core Idea about LSTM
- Conveyor belt structure: the cell state runs straight down the entire chain, with only some minor linear interactions. It’s very easy for information to just flow along it unchanged.
- A gate: gates are composed of a sigmoid neural net layer and a pointwise multiplication operation.

Introducing LSTM Network
Step 1: Forget gate layer. Step 2: tanh layer.
The first step in our LSTM is to decide what information we’re going to throw away from the cell state. This decision is made by a sigmoid layer called the “forget gate layer.” It looks at $h_{t-1}$ and $x_t$, and outputs a number between 0 and 1 for each number in the cell state $C_{t-1}$. A 1 represents “completely keep this” while a 0 represents “completely get rid of this.”
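A minimal sketch of the forget gate, assuming the usual form f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f) followed by a pointwise multiplication with the old cell state. The weights `W_f` and `b_f` below are hypothetical values chosen to force one entry to be kept and one to be erased.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forget_gate(h_prev: Vector, x_t: Vector, W_f: Matrix, b_f: Vector) -> Vector:
    """Sigmoid layer over [h_{t-1}, x_t]: one value in (0, 1) per cell-state entry."""
    concat = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    return [sigmoid(sum(w * v for w, v in zip(row, concat)) + b) for row, b in zip(W_f, b_f)]


def apply_gate(gate: Vector, cell_prev: Vector) -> Vector:
    """Pointwise multiplication: values near 1 keep an entry, values near 0 erase it."""
    return [g * c for g, c in zip(gate, cell_prev)]


# Hypothetical parameters: a large positive bias keeps entry 0, a large negative bias erases entry 1.
W_f = [[0.0, 0.0], [0.0, 0.0]]
b_f = [20.0, -20.0]
f_t = forget_gate([0.0], [0.0], W_f, b_f)
print(apply_gate(f_t, [3.0, 5.0]))
```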

Introducing LSTM network, cont'd (1)
Step 3: old cell state Step 4: final state
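For reference, steps 3 and 4 are usually written as below in the standard LSTM formulation (without peephole connections). The slide's own equations are not reproduced in this transcript, so this is an assumption about what it showed; $i_t$ (input gate), $\tilde{C}_t$ (candidate values from the tanh layer) and $o_t$ (output gate) are symbols introduced here, while $f_t$ is the forget gate output from step 1.

```latex
\begin{aligned}
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
```

Step 3 scales the old cell state $C_{t-1}$ by the forget gate and adds the gated candidate values; step 4 produces the output $h_t$ as a filtered version of the new cell state.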

LSTM Method for the report
Single node’s structure. Equation for LSTM model.
Each memory cell in an LSTM network has the same input and output as a hidden unit in the vanilla RNN, but has more parameters and a system of gating units that control the flow of information. It is a more complicated model, but it makes the short-term memory of the vanilla RNN model last longer!
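To show what the extra parameters and gating units look like, here is a sketch of one step of a single LSTM layer. It assumes the standard LSTM equations without peephole connections, which may differ from the exact formulation in the paper; the parameter names (`W_f`, `W_i`, `W_c`, `W_o` and the matching biases) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: List[Vector], v: Vector, b: Vector) -> Vector:
    """Compute W v + b for a matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector, p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One LSTM step: same input and output as a vanilla RNN unit, plus a gated cell state."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(p["W_f"], z, p["b_f"])]    # forget gate
    i = [sigmoid(v) for v in affine(p["W_i"], z, p["b_i"])]    # input gate
    g = [math.tanh(v) for v in affine(p["W_c"], z, p["b_c"])]  # candidate values
    o = [sigmoid(v) for v in affine(p["W_o"], z, p["b_o"])]    # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Hypothetical parameters: forget gate open, input gate closed, output gate open.
zeros = [[0.0, 0.0]]
params = {"W_f": zeros, "b_f": [20.0], "W_i": zeros, "b_i": [-20.0],
          "W_c": zeros, "b_c": [0.0], "W_o": zeros, "b_o": [20.0]}
h, c = [0.0], [0.7]
for _ in range(100):
    h, c = lstm_step([1.0], h, c, params)
print(c, h)
```

With the forget gate open and the input gate closed, the cell state is carried almost unchanged across 100 steps, which is the sense in which the memory lasts longer.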

LSTM for the report, cont'd (1)
Structure for LSTM model. Here you can see the big picture of an LSTM model, which is similar to the vanilla RNN model. However, it is much more effective because of its long-term memory, which lets the system behave more like a human brain than the vanilla RNN does.

LSTM Summary
Functionalities:
- The input gate can allow the incoming signal to alter the state of the memory cell or block it.
- The output gate can allow the state of the memory cell to have an effect on the other units or prevent it.
- The forget gate modulates the self-recurrent connections of the memory cell, allowing the cell to remember or forget its previous state as needed.
- The multiplicative gates allow the LSTM memory cells to store and access information over a longer period of time, thereby mitigating the vanishing gradient problem.
Advantages:
- More parameters and a system of gating units that control the flow of information.
- Higher accuracy in prediction.
- LSTM has been applied to various real-world problems, such as protein structure prediction [24], speech recognition [25], [26] and handwriting recognition [27], [28]. As expected, its advantages are most pronounced for problems that require the use of a relatively long range of contextual or temporal information.

RESULTS AND OTHER USES

Results Comparison
Read the figure column by column first: the first column is the first definition of dropout, and the second column is the second definition. The first row shows the AUC of the vanilla RNN compared with other models for Introduction to Java Programming, and you can see… The second row shows the AUC of the different models for the Gastronomy lecture. The last row shows the AUC for the Java programming course with the LSTM model included; it has outperformed…
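For readers unfamiliar with the metric, AUC (area under the ROC curve) can be computed as the fraction of (dropout, non-dropout) pairs in which the dropout student received the higher predicted score, counting ties as half. The sketch below uses made-up labels and scores, not results from the paper.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive (1) and negative (0) examples."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ordering
    return wins / (len(pos) * len(neg))


# Hypothetical dropout labels (1 = dropout) and predicted dropout scores.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so higher curves in the figure mean better models.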

Thank you!

Appendix

Conventional analyzing methods - IOHMM
Figures: structure of the IOHMM model; equations for the IOHMM 1 model; equations for the IOHMM 2 model.

Conventional analyzing methods - Vanilla RNN
Figure panels: recurrent vanilla RNN; feedforward vanilla RNN.
The network structure of vanilla RNN is shown in the left part of Figure 3. The recurrent connections from the hidden layer to itself help the network “memorize” the previous inputs in the internal state. It is worth noting that each node in Figure 3 actually represents a layer of network units. As such, each hidden node represents a certain number of hidden units as in the conventional neural network.

Vanilla RNN, cont'd (1)
Each of these nodes in the graph is actually a whole neural network, as shown on the slides.

Other applications for LSTM?
- protein structure prediction
- speech recognition
- handwriting recognition
- language modeling
- translation
- image captioning
