Learning Analytics – Tools & Techniques For Analysing Large Volumes Of Educational Data

Dr. Richard Price, Data Scientist, Planning Services, Flinders University

History of the Techniques and Methods of Learning Analytics
Learning analytics draws upon techniques from a number of established fields: statistics, artificial intelligence, machine learning, data mining, social network analysis, text mining and web analytics, operational research, and information visualisation.
Application domains such as business intelligence, national security intelligence and learning analytics all have an interest in analysing large volumes of data from disparate data sources, and are providing the business cases for the rapid growth in ‘big data’ and data analytics.
Learning analytics encompasses support to both the business and teaching functions of the learning institution.

Data Types
Structured data: typically stored in databases or spreadsheets and managed in accordance with a standardised storage format and ontology (e.g. names, place names). Examples: SATAC applications, load, enrolments, FLO usage data.
Unstructured data: text, audio, imagery, video. Examples: student email, chat rooms, questionnaire responses, lecture videos (audio and video).
Different data types lend themselves to different analytical techniques. Unstructured data often requires pre-processing to enable structured data analysis.
Unstructured data analysis:
Text: document clustering, topic detection, entity extraction (people, places, locations, dates, times etc.), sentiment analysis (+, -)
Audio: speaker identification, language identification, speech to text, keyword spotting
Video: face recognition, object recognition, target tracking

Structured Data Analysis
Descriptive statistics: sums, means, standard deviations, basic plotting (graphs, charts, histograms).
Data visualisation: tools that enable a human to see meaningful patterns in data.
Machine learning: tools that enable computers to find patterns in data in order to perform classification, clustering or prediction, e.g. decision trees, neural networks, support vector machines, linear regression, self-organising maps, k-means.
Predictive analytics: algorithmic approaches (generally machine learning) for predicting key target variables of interest.
Example LA projects: identification of ‘at risk’ students (Student Success Project), future University enrolments, topic enrolments.
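As a small illustration of the descriptive statistics listed above, the following sketch computes a sum, mean and sample standard deviation with Python's standard library. The enrolment counts are hypothetical values invented for the example; they are not figures from the talk.

```python
import statistics

# Hypothetical topic enrolment counts; illustrative values only.
enrolments = [120, 95, 143, 88, 110]

total = sum(enrolments)
mean = statistics.mean(enrolments)
std_dev = statistics.stdev(enrolments)  # sample (n - 1) standard deviation

print(f"sum={total}, mean={mean:.1f}, std dev={std_dev:.1f}")
```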

Data Visualisation
[Figure panels: Structured Data; Unstructured Data]

Advanced Data Visualisation
[Figure: Combining Structured and Unstructured Data Sources]

Predicting Enrolments From Applications Data
Aim: to predict next year’s commencing load using the past 3 years of SATAC applications data.
Predictions are made at the applicant level, not from a time series.
A decision tree machine learning approach was adopted.
Input variables for each applicant included academic performance, schooling, demographics (e.g. age, gender and postcode), information about each of the applicant’s preferences (preference number, course, institution and institution campus) and a number of proximity variables.
Output (target) variable: whether the student was enrolled at Flinders University at the Semester 1 census.

Predicting Enrolments From Applications Data
The three P’s: Prestige, Proximity and Price.
Proximity input variables: for two given points P1 = (lat1, lon1) and P2 = (lat2, lon2), the haversine (great-circle) distance in kilometres between P1 and P2 is defined as:
d(P1, P2) = ACOS(SIN(lat1)*SIN(lat2) + COS(lat1)*COS(lat2)*COS(lon2 - lon1)) * 6371
(This expression is the spherical law of cosines form, which gives the same great-circle distance as the haversine formula; latitudes and longitudes are in radians and 6371 is the Earth's mean radius in kilometres.)
The distance was calculated between the applicant’s primary residence and all major SA University campuses, with each value being an input into the machine learning algorithm.
Two models were developed: a) from the 1st week in September, b) from the 2nd week in January.
Training data consisted of 3 years of data (2011, 2012 and 2013) to predict 2014 enrolments: 25,551 training examples for September and 74,516 for January.
A number of commonly used machine learning algorithms could have been used; we chose a CHAID decision tree algorithm.
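The distance formula above can be sketched in Python as follows. This is a minimal illustration, not the project's own code: the function names are hypothetical, inputs are assumed to be in decimal degrees (converted to radians internally), and the coordinates in the usage lines are made-up values rather than real campus or applicant locations. A haversine version is included alongside the ACOS form for comparison, since the two give the same great-circle distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, as used in the formula above


def great_circle_km(lat1, lon1, lat2, lon2):
    """Spherical-law-of-cosines distance; inputs in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle) * EARTH_RADIUS_KM


def haversine_km(lat1, lon1, lat2, lon2):
    """Haversine distance; numerically steadier for very short distances."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical residence and campus coordinates (illustrative only).
residence = (-35.00, 138.60)
campus = (-34.90, 138.50)
print(great_circle_km(*residence, *campus))
print(haversine_km(*residence, *campus))
```

In a feature table, one such distance per campus would become one proximity column per applicant.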

Predicting Enrolments From Applications Data
Results: lift versus output percentile profiles for the September model
[Figure panels: Training; Validation]
Model      Number of Applicants (Predictions)  Predicted Commencing Load  Actual Commencing Load  % Error
September  8557                                1394                       1365                    2.1
January    26457                               4340                       3858                    12.5
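The % Error column is consistent with the absolute difference between predicted and actual load, expressed relative to the actual load, and the sketch below reproduces it from the table's figures. The `lift_at` function is an assumption about how a lift profile is typically computed (response rate among the top-ranked fraction of applicants divided by the overall response rate); the talk does not state its exact definition, and the scores and outcomes used with it are hypothetical.

```python
def percent_error(predicted, actual):
    """Absolute error as a percentage of the actual value."""
    return abs(predicted - actual) / actual * 100


# Predicted and actual commencing load, taken from the results table.
results = {"September": (1394, 1365), "January": (4340, 3858)}
for model, (predicted, actual) in results.items():
    print(f"{model}: {percent_error(predicted, actual):.1f}% error")


def lift_at(scores, outcomes, fraction):
    """Lift of the top `fraction` of applicants when ranked by model score."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    overall_rate = sum(outcomes) / len(outcomes)
    return top_rate / overall_rate


# Hypothetical model scores and enrolment outcomes (1 = enrolled).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(lift_at(scores, outcomes, 0.2))
```

Plotting `lift_at` over a range of fractions for the training and validation sets gives profiles of the kind compared in the figure.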

Predicting Enrolments From Applications Data
The strong consistency of the lift profiles between the training results and the test and validation results is indicative of structural patterns of behaviour that appear to exist across applicants to South Australian universities.
These patterns of behaviour appear to be captured by the rules contained within the decision trees produced during the training stage of the modelling process.
A paper reporting this work has been accepted for presentation at the Australian Association for Institutional Research Forum in November, with possible publication in the Journal of Institutional Research.
If performance in future years proves to be similar, the approach should be able to provide valuable support to the management of the applications process.

Predicting Topic Enrolments
Planning Services was approached by the School of Nursing to predict future topic enrolments to assist in resource and placement management.
The primary focus is on predicting topic enrolments for 2nd year undergraduate nursing topics.
The program is largely deterministic, but is complicated by pre-requisites, large numbers of 2nd year commencers with advance credit, relatively high percentages of part-time students, and a lack of historical training data due to a major course restructure in 2013.

Predicting Topic Enrolments
A similar machine learning (decision tree) approach was adopted; however, the input variables consisted only of course code, attendance type and previous topics passed (no student demographic or BOA information).
Binary target variable: 1 = did enrol in the target topic, 0 = did not enrol in the target topic.
Under the new program, 2nd year topics are being run for the first time in 2014, so only 1st year 2013 students are available to train and test on.
Testing gave promising results, and a model was developed to predict topic enrolments for 2015.
Predictions for all seven 2nd year nursing topics were provided and validated by the School as being consistent with their estimates.
The School of Nursing has requested that the approach become part of its standard business process in future years, and discussions are underway as to how Planning Services can meet this request.
The School of Education, Humanities and Law has provided 12 topics of interest to help Planning Services further develop the approach within a less constrained course structure.
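To illustrate how a chi-square based decision tree such as CHAID might choose between categorical inputs like these, the sketch below scores each predictor against the binary target with a Pearson chi-square statistic and picks the strongest. It is a simplified illustration only: a full CHAID implementation also merges similar categories and compares adjusted p-values, and the student records, field names and values here are hypothetical, not project data.

```python
from collections import Counter


def chi_square(categories, targets):
    """Pearson chi-square statistic for a categorical predictor vs a target."""
    n = len(targets)
    category_counts = Counter(categories)
    target_counts = Counter(targets)
    observed = Counter(zip(categories, targets))
    statistic = 0.0
    for category, n_category in category_counts.items():
        for target, n_target in target_counts.items():
            expected = n_category * n_target / n
            statistic += (observed[(category, target)] - expected) ** 2 / expected
    return statistic


# Hypothetical records: (attendance type, passed prerequisite, enrolled in topic).
students = [
    ("FT", "yes", 1), ("FT", "yes", 1), ("FT", "no", 0), ("PT", "yes", 1),
    ("PT", "no", 0), ("PT", "no", 0), ("FT", "yes", 1), ("PT", "yes", 0),
]
enrolled = [record[2] for record in students]
predictors = {
    "attendance_type": [record[0] for record in students],
    "passed_prerequisite": [record[1] for record in students],
}

# Score every predictor and split on the one most associated with the target.
split_scores = {name: chi_square(values, enrolled) for name, values in predictors.items()}
best_split = max(split_scores, key=split_scores.get)
print(split_scores, best_split)
```

A tree is then grown by repeating the same selection within each branch of the chosen split until no predictor shows a significant association.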

In Conclusion
Learning analytics is still in its infancy.
The Student Success Project, Topic Enrolment and University Enrolment Prediction projects have demonstrated some early promise.
Across the University we have the technical expertise and strong management support to progress learning analytics at Flinders.
We are particularly keen to work with the faculties to progress analytics in support of the teaching function.
We are performing research-like activities within an operational environment, and are looking for trailblazers without the fear of failure.
We’re keen, enthusiastic and we’re here to help!