Big Data in Education Rachel Hogue.

Presentation on theme: "Big Data in Education Rachel Hogue."— Presentation transcript:

1 Big Data in Education Rachel Hogue

2 Overview
Big Data and Education Communities
Why Collect Educational Data?
Learning Theories
eLearning
What Data Can We Collect?
Examples of eLearning Companies and their Use of Big Data
Data Analysis: existing implementations using educational data, methods that work well for educational data
MOOCdb
Privacy concerns

3 Communities
International Educational Data Mining Society: founded July 2011; EDM workshop in 2005 (at the Association for the Advancement of Artificial Intelligence); EDM conference in 2008; Journal of Educational Data Mining (JEDM) since 2009
Society for Learning Analytics Research: first conference, Learning Analytics and Knowledge (LAK), in 2011; Journal of Learning Analytics, founded 2012

4 Why Collect Educational Data?

5 Why Collect Educational Data?
Personalize education
Better assessment of learners
Multiple dimensions: social, cognitive, emotional, metacognitive
Multiple levels: individual, group, institutional
To promote new scientific discoveries and to advance learning sciences
Many theories; little hard data to support them
Opportunity to discover new learning patterns

6 Why Collect Educational Data?
“Not only can you look at unique learning trajectories of individuals, but the sophistication of the models of learning goes up enormously.” Arthur Graesser, Editor, Journal of Educational Psychology

7 A Look Backwards
Collecting educational data was highly resource-intensive and difficult to scale.
Much of the data that was easily collectible was purely summative in nature.
Getting data on learning processes and learner behaviors in field settings required methods like quantitative field observations, video recordings, and think-aloud studies, none of which scale easily.

8 Learning Types

9 Learning Types Visual (spatial) Auditory Kinesthetic / haptic

10 Learning Theories Problem-Based Learning Anchored Instruction
Cognitive Apprenticeship Situated Learning

11 eLearning

12 eLearning
WBI – Web Based Instruction
Learning technology: networking and computing technologies are used to improve educational practices

13 eLearning
WBI – Web Based Instruction
Learning technology: networking and computing technologies are used to improve educational practices
MOOC – Massive Open Online Course

14 eLearning

15 What Data Can We Collect?

16 What Data Can We Collect?
Administrative data (who are you?): address, name, birth date
Content data (inferred properties of the material): difficulty, subject
Longitudinal data (data from a long period of time): grades, standardized testing results, time on task, attendance, click patterns, how long a student holds a mouse pointer over a particular answer

17 What Data is Available Already?
PSLC DataShop:
A central repository to secure and store research data
A set of analysis and reporting tools
>250,000 hours of students using educational software
>30 million student actions, responses & annotations
Actions: entering an equation, manipulating a vector, typing a phrase, requesting help
Responses: error feedback, strategic hints
Annotations: correctness, time, skill/concept

18 Online Education Formats
Video, online modules, written documents, audio files, instructions for an activity or task

19 CourseSmart Embeds technology directly into digital textbooks
Provides an “engagement index score”, which measures how much students are interacting with their eTextbooks (viewing pages, highlighting, writing notes, etc.). Researchers have found that the engagement index score helps instructors predict student outcomes more accurately than traditional measurement methods, such as class participation.

20 Duolingo
Site and smartphone app to help people learn foreign languages
Luis von Ahn: professor at Carnegie Mellon; CAPTCHA and reCAPTCHA
“Twofer”

21 Data from Duolingo
How long does it take someone to become proficient in a certain aspect of a language?
How much practice is optimal?
What is the consequence of missing a few days?
There are theories about learning languages, such as the idea that adjectives should be taught before adverbs, but previously there was little hard data to support these theories.

22 Conclusions from Duolingo Data
The best way to teach a language depends on the students’ native tongue and the language they’re trying to acquire.
Example, Spanish -> English: “it” tends to confuse and create anxiety for Spanish speakers, since the word doesn’t easily translate into their language.
Women do better at sports terms.
Men do better at cooking and food terms.
In Italy, women as a group learn English better than men.

23 Learning Analytics Implementations
Still very few:
Knewton
Signals project at Purdue University
Ellucian Degree Works, “a comprehensive academic advising, transfer articulation, and degree audit solution that aligns students, advisors, and institutions to a common goal: helping students graduate on time.”
Blackboard Analytics
Signals predicts which students are falling behind, as early as 2 weeks into a course, based on a student success algorithm. It sends emails, positive messages, and suggestions on a dashboard.

24 Analysis Methods
Prediction, Structure Discovery, Relationship Mining

25 Prediction Develop a model which can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables) Which students are off-task? Which students will fail the class?
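
To make the prediction idea concrete, here is a minimal sketch that fits a logistic regression by gradient descent to estimate whether a student will fail. The predictor variables (hours on task, attendance rate), the toy data, and all function names are assumptions made up for illustration; they do not come from the presentation or from any real dataset.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x):
    """Probability of the predicted variable (here: failing) for one student."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical toy data: (hours on task per week, attendance rate); 1 = failed.
students = [(1.0, 0.4), (2.0, 0.5), (1.5, 0.3), (6.0, 0.9), (7.0, 0.95), (5.0, 0.8)]
failed = [1, 1, 1, 0, 0, 0]
weights = fit_logistic(students, failed)

at_risk = predict_proba(weights, (1.0, 0.3))   # low engagement
on_track = predict_proba(weights, (7.0, 0.9))  # high engagement
```

In practice a library model and held-out evaluation data would replace this hand-rolled loop; the sketch only shows the shape of the task, predictor variables in and one predicted variable out.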

26 Structure Discovery Find structure and patterns in the data that emerge “naturally” No specific target or predictor variable
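
One common structure discovery technique is clustering. The sketch below runs a small k-means on made-up activity data to group students without any target variable. The features (sessions per week, minutes per session), the data, and the choice to seed centroids from the first k points are assumptions for illustration, not something the presentation specifies.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=10):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


# Hypothetical data: (sessions per week, average minutes per session).
activity = [(1.0, 10.0), (9.0, 55.0), (2.0, 12.0), (1.5, 8.0), (10.0, 60.0), (8.0, 50.0)]
centroids, labels = kmeans(activity, k=2)
```

The two groups that emerge (occasional short sessions versus frequent long ones) were never labeled in advance, which is the point of the method.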

27 Relationship Mining Discover relationships between variables in a data set with many variables Correlation or causation
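
A simple form of relationship mining is to compute the correlation between every pair of variables and rank the pairs by strength. The sketch below does this with Pearson correlation; the variable names and values are invented for illustration. A strong correlation found this way does not by itself establish causation.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_relationships(table):
    """All variable pairs with their correlation, strongest (by absolute value) first."""
    pairs = [((a, b), pearson(table[a], table[b])) for a, b in combinations(table, 2)]
    return sorted(pairs, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical per-student measurements (one entry per student).
table = {
    "hours_on_task": [1, 2, 3, 4, 5],
    "quiz_score": [52, 60, 71, 80, 88],
    "hint_requests": [9, 7, 6, 3, 2],
}
ranked = rank_relationships(table)
```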

28 MOOCdb
Collaborative, online learning research
Una-May O’Reilly, Ph.D.: Principal Research Scientist, MIT CSAIL; leader of ALFA (Any Scale Learning for All)

29 Different Formats of Data
SQL dump: student state information
XML files: course information
Emails and surveys
JSON lines: clickstream data
edX Platform
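
As an illustration of the JSON lines format, the sketch below reads a clickstream log with one JSON object per line and counts events per student. The field names (`student`, `event_type`, `time`) and the sample records are assumptions for this example, not the actual edX log schema.

```python
import json
from collections import Counter


def count_events(lines):
    """Count (student, event_type) pairs in an iterable of JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)  # each line is an independent JSON object
        counts[(event["student"], event["event_type"])] += 1
    return counts


# Hypothetical log excerpt; real logs would be read from a file line by line.
sample_log = """\
{"student": "s1", "event_type": "play_video", "time": "2014-03-01T10:00:00"}
{"student": "s1", "event_type": "pause_video", "time": "2014-03-01T10:02:10"}
{"student": "s2", "event_type": "play_video", "time": "2014-03-01T10:05:00"}
{"student": "s1", "event_type": "play_video", "time": "2014-03-01T10:06:30"}
"""
counts = count_events(sample_log.splitlines())
```

Because every line parses on its own, a large log can be processed as a stream without loading the whole file into memory.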

30 Multiple Platforms and Data Control
EdX and Coursera Controlled by MIT and Stanford, separate entities

31 Data model to organize raw data streams
Unifies different platforms

32 MOOCdb
Each class: Student Information Tables, Observations Tables, Submissions Tables, Collaboration Tables, Feedback Tables
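
To show how a shared relational model supports analysis, here is a small sketch using an in-memory SQLite database. The table and column names are hypothetical stand-ins loosely following the table groups listed above; they are not the real MOOCdb schema. The query totals observed resource time per country.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical, MOOCdb-like tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE students (
            student_id INTEGER PRIMARY KEY,
            country    TEXT NOT NULL
        );
        CREATE TABLE observations (
            student_id       INTEGER NOT NULL REFERENCES students(student_id),
            resource         TEXT NOT NULL,
            duration_seconds INTEGER NOT NULL
        );
        CREATE TABLE submissions (
            student_id INTEGER NOT NULL REFERENCES students(student_id),
            problem    TEXT NOT NULL,
            is_correct INTEGER NOT NULL
        );
        """
    )
    # Made-up rows for illustration only.
    conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "US"), (2, "BR"), (3, "US")])
    conn.executemany(
        "INSERT INTO observations VALUES (?, ?, ?)",
        [(1, "lecture_1", 600), (2, "lecture_1", 300), (3, "lecture_2", 300)],
    )
    conn.executemany("INSERT INTO submissions VALUES (?, ?, ?)", [(1, "p1", 1), (2, "p1", 0)])
    return conn


def seconds_by_country(conn):
    """Total observed resource time per country, ordered by country code."""
    return conn.execute(
        """
        SELECT s.country, SUM(o.duration_seconds)
        FROM observations AS o JOIN students AS s USING (student_id)
        GROUP BY s.country
        ORDER BY s.country
        """
    ).fetchall()


usage = seconds_by_country(build_demo_db())
```

Once data from different platforms is loaded into the same tables, a query like this can be written once and reused across courses.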

33 Benefits of MOOCdb
Public, shared data model; avoid redundant work
Foster analytic consistency
Engage more people

34 MOOCviz

35 MOOCviz Resource use compared by country

36 Privacy Concerns
Hardcopy records were phased out in favor of district-based hard drive storage some time ago, but the advent of cloud computing has seen a trend toward the creation of third-party data silos (or clouds).
Teachers and parents are concerned about privacy breaches by hackers and marketers.
inBloom: a Gates-funded nonprofit that housed student data in the cloud; it closed its doors after parental protest.

37 Privacy Concerns
This past May, the Obama administration released an 85-page report on big data and its use in the US among consumers and businesses:
"Big data and other technological innovations, including new online course platforms that provide students real time feedback, promise to transform education by personalizing learning. At the same time, the federal government must ensure educational data linked to individual students gathered in school is used for educational purposes, and protect students against their data being shared or used inappropriately."

38 History of Educational Big Data Policies
1974: The Family Educational Rights and Privacy Act of 1974 (FERPA)
2000: The Children's Online Privacy Protection Act of 1998 (COPPA) takes effect
2005: New initiative granting money to states that implement Statewide Longitudinal Data Systems (SLDS)
2008: FERPA is expanded: contracted vendors and school volunteers now have access to the data, with or without parental input
2011: FERPA is amended once again, granting "authorized representatives" of state authorities access to student data
2011: The Shared Learning Collaborative (SLC), which will later become inBloom, is created

39 Questions or Comments?
Email me with any questions.

