Published by Jeanette Beilby. Modified over 9 years ago.
1
Ethics-aware Learning Analytics
Mykola Pechenizkiy, http://www.win.tue.nl/~mpechen/
IRB and Big Data NSF Workshop, George Mason University, Arlington, VA, USA
2
Who I am
Applied Data Mining researcher
Data scientist
– Predictive analytics, evolving data, big data
– Adaptive learning, concept drift, context
– Web analytics, customer/student/user analytics
Educational Data Mining/Learning Analytics-related:
– EDM 2011, EDM 2015, LASI 2014, JEDM
– Handbook of EDM
– President-Elect IEDMS
IRB_BD@GMU 9 Nov 2014, Ethics-aware Predictive Learning Analytics, Mykola Pechenizkiy, TU Eindhoven
3
Outline
Big Data opportunities with education platforms
Fears of Big Data coming to schools
Reconsidering priorities in developing/adopting the Data-Driven Education paradigm
– Ethics-awareness and trustworthiness
Take-aways: where advice from IRB panels is welcome
4
More ICT – More Data Sources
5
Four Major Types of Learning & Kinds of Questions EDM\LA Can Assist with
How to (re)organize the classes, assessment, or placement of materials based on usage and performance data?
How to identify those who would benefit from feedback, study advice or other help? How to decide which kind of help would be most effective?
How to help learners (re-)find useful material, whether individually or collaboratively with peers?
6
Kinds of Data Being Collected
Administrative data
– Who follows which program, who takes which course, who registers for an (interim) exam or re-exam
– Demographics, school grades, etc.
MOOC and LMS
– Resource usage data
– Assessment/assignment data (online tests, source code)
– Forums, collaboration, feedback/help requests
– Students’ evaluation of learning resources
ITS, educational games, professional learning, e-Health, simulators, ...
Gaming, browsing, Gmail, Facebook, Twitter
7
EDM\LA: Data, Approach, Knowledge
Data
– Interactions data: usage logs & contexts
– “Feedback” data: opinions, preferences, needs
– Administrative data: enrolments, results, payments, graduation, employment
– Descriptive data: demographics, characteristics
Approach
– Classification, clustering, association analysis, sequence mining, visual analytics, process mining
– Tasks shown in the diagram: categorizing students; grouping similar students; finding courses taken together or popular (parts of) study programs; understanding study curricula; facilitating reasoning about the process or results via interactive data/model visualization
Goals
– Identify high-risk students
– Predict new student application rates
– Predict student retention/dropout
– Course planning & scheduling
– Faculty teaching load estimation
– Predict demand for resources (library, cafeteria, housing)
– Predict alumni donation
8
Learning@Scale Potential
Two central questions in Data-Driven Education (DDE): “Does it work?” and “Which way is better?”
Ongoing research:
Gaining insights via (massive) A/B testing
Predictive modeling with actionable attributes
– Prediction vs. persuasion vs. manipulation
Predictive modeling with sensitive attributes
– Ethics-aware personalization without discrimination
9
Data Trumps Experts’ Intuition
LAK, AIED & EDM: help in understanding what works and what does not, student modeling, etc.
MOOC, ITS & L@S: A/B testing is becoming popular
MOOC platforms provide support for A/B testing
Example by Ken Koedinger (CMU) at Data-driven education @NIPS2013
Intuitive design can be replaced by data-driven design
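A minimal sketch of how the "Which way is better?" question might be checked on A/B data, assuming a binary outcome (for example, passing a quiz) and a standard two-proportion z-test. The function name and the counts are hypothetical and do not come from the talk or from the Koedinger example.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants share one success rate."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled success rate under the null hypothesis of no difference.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, expressed via erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Hypothetical counts: students who passed after seeing design A or design B.
z, p = two_proportion_z_test(success_a=420, n_a=1000, success_b=465, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The sketch does not guard against a pooled rate of exactly 0 or 1, where the standard error is zero. A test like this only compares averages; it says nothing about which students benefit.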
10
Learning@Scale Potential
Two central questions in Data-Driven Education (DDE): “Does it work?” and “Which way is better?”
Some emerging research lines:
Gaining insights via (massive) A/B testing
Predictive modeling with actionable attributes
– Prediction vs. persuasion vs. manipulation
Predictive modeling with sensitive attributes
– Ethics-aware personalization without discrimination
11
If We Were Able to Look Deeper
How these averages could possibly differ per
– Student learning style
– Student background
– Country where they studied
– Ethnicity
– Gender
– Parents
– …
12
Uplift Predictors
Suppose we do have data from A/B testing
– The control dataset: individuals on which no action was taken
– The treatment dataset: individuals on which an action was taken
Build a model which predicts the causal influence of the action on a given individual
Some students prefer a story, others prefer a formula, e.g. girls => story, boys => formula
Challenging to learn such predictors, but feasible!
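The idea above can be sketched in its simplest form: estimate the success rate per student segment separately in the control and treatment datasets, and take the difference as the estimated uplift. This is an illustrative assumption, not the method used in the talk. The segment names, the data and both function names are invented, and real uplift predictors would learn from many features rather than one segment label.

```python
from collections import defaultdict


def success_rates(records):
    """Map each segment to its observed success rate; records are (segment, outcome)."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [successes, count]
    for segment, outcome in records:
        totals[segment][0] += outcome
        totals[segment][1] += 1
    return {seg: s / n for seg, (s, n) in totals.items()}


def estimate_uplift(control, treatment):
    """Uplift per segment = treatment success rate minus control success rate."""
    rate_c = success_rates(control)
    rate_t = success_rates(treatment)
    # Only segments observed in both groups can be compared.
    return {seg: rate_t[seg] - rate_c[seg] for seg in rate_c.keys() & rate_t.keys()}


# Hypothetical A/B data: control students saw a formula, treatment students a story.
# "low_prior"/"high_prior" is an invented segmentation (prior maths experience).
control = [("low_prior", 0), ("low_prior", 0), ("low_prior", 1), ("low_prior", 0),
           ("high_prior", 1), ("high_prior", 1), ("high_prior", 1), ("high_prior", 0)]
treatment = [("low_prior", 1), ("low_prior", 1), ("low_prior", 1), ("low_prior", 0),
             ("high_prior", 1), ("high_prior", 0), ("high_prior", 0), ("high_prior", 1)]

uplift = estimate_uplift(control, treatment)
for segment, value in sorted(uplift.items()):
    action = "story" if value > 0 else "formula"
    print(f"{segment}: estimated uplift {value:+.2f} -> show {action}")
```

With so few records per segment the estimates are far too noisy to act on; the point is only that the sign of the uplift can differ between segments even when the overall averages look similar.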
13
cf. Personalized Medicine Paradigm
A typical medical trial:
– treatment group: gets the treatment
– control group: gets placebo (or another treatment)
– do a statistical test to show that the treatment is better than placebo
With uplift predictors we can find out
– for whom the treatment works and works best, or
– in case of alternative treatments, which treatment works best for whom
14
Fear of Privacy Violation & Data Misuse
“Many companies are looking to profit from student and teacher data that can be easily collected, stored, processed, customized, analyzed, and then ultimately resold.”
Philip McRae (Alberta Teachers’ Association)
Image: corpwatch.org/img/original/google.jpg
15
If We Were Able to Look Deeper
How these averages could possibly differ per
– Student learning style
– Student background
– Country where they studied
– Ethnicity
– Gender
– Parents
– …
Sensitive attributes
cf. discrimination in hiring, granting credit or loans, etc.
16
Fear of Predictive Analytics
Are the decisions based on predictive models always ethical?
(Personalized) decisions may be unfair to a certain group (race, ethnicity, gender)
Are the models/decisions trustworthy?
Do predictive models give guarantees? Is the accuracy high enough?
Do models provide meaningful insights? Are they interpretable and transparent?
“Correlation is not causation”
17
Fears of Personalization
“When Personalization Goes Bad”
http://www.portical.org/blog/when-personalization-goes-bad
“Rebirth of the Teaching Machine through the Seduction of Data Analytics: This Time It's Personal”
http://www.philmcrae.com/2/post/2013/04/rebirth-of-the-teaching-maching-through-the-seduction-of-data-analytics-this-time-its-personal1.html
“This time it is Personal and Dangerous”
http://barbarabray.net/2013/12/30/this-time-its-personal-and-dangerous/
Images: © Pawel Kuczynski; postcard (World’s Fair, Paris 1899) predicting what learning will be like in France in the year 2000
18
Predicting with Sensitive Attributes
Paradox: we need to use personal data to control for unethical predictive analytics
“Fairness through awareness”, Dwork et al.
“It’s Not Privacy, and it’s Not Fair”, Dwork & Mulligan
“Discrimination and Privacy in the Information Society”, Custers et al. (Eds)
Data mining for discrimination discovery
Explainable vs. unethical discrimination
Accuracy-discrimination tradeoff
19
Take-aways
ITS, MOOC, OLI – massive scale, cheap and scalable experimentation online
What should be the policies on student data collection, sharing and use?
Potential for data-driven education: finding out what works best for students via randomized trials
What is and is not ethical? (cf. the Facebook study)
Effects of persuasion are not uniform
Potential and need for personalization
DM can learn causal models from A/B testing data
How to prevent malignant forms of DDE?
General guidelines for ethics-aware personalization and persuasion?
20
Thank you! Feedback, questions, collaboration ideas: m.pechenizkiy@tue.nl Staying connected: nl.linkedin.com/in/mpechen/
21
Fears of Big Data Coming to Schools
Personal/educational data misuse, poor predictions, bad personalization
SIAT@SFU 11 Aug 2014, Predictive Analytics for Data-Driven Education, Mykola Pechenizkiy, TU Eindhoven
25
Connections to Privacy & Ethics
What is the education data scientist’s philosophy?
Is EDM always ethical?
Is EDM a threat to privacy?
– Dangers of misuse of information
– Unethical decision making or personalization
Will these discussions slow down or kill the development and adoption of predictive learning analytics?
26
Predicting with Actionable Attributes
Prediction vs. manipulation; uplift predictors
27
Data Trumps Intuition
LAK, AIED & EDM: help in understanding what works and what does not, student modeling, etc.
MOOC, ITS & L@S: A/B testing is becoming popular
MOOC platforms provide support for A/B testing
Example by Ken Koedinger (CMU) at Data-driven education @NIPS2013
Intuitive design can be replaced by data-driven design
28
If We Were Able to Look Deeper
How these averages could possibly differ per
– Student learning style
– Student background
– Country where they studied
– Ethnicity
– Gender
– …
29
Towards Personalized Medicine
A typical medical trial:
– treatment group: gets the treatment
– control group: gets placebo (or another treatment)
– do a statistical test to show that the treatment is better than placebo
With uplift predictors we can find out
– for whom the treatment works and works best, or
– in case of alternative treatments, which treatment works best for whom
30
Uplift Predictors
Suppose we do have data from A/B testing
– C: the control dataset, individuals on which no action was taken
– T: the treatment dataset, individuals on which an action was taken
Build a model which predicts the causal influence of the action on a given individual
Challenging, if we assume that there is no globally better action
– Some students prefer a story, others prefer a formula
But it is feasible
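One common way to write down what such a model estimates, assuming a binary outcome y and student features x (both symbols are assumptions added here, not notation from the slides), is the difference between the success probabilities learned from T and from C:

```latex
% Assumed setting: binary outcome y, student features x,
% T = treatment dataset, C = control dataset (as named on the slide).
\[
u(x) = P(y = 1 \mid x, T) - P(y = 1 \mid x, C)
\qquad
a^{*}(x) =
\begin{cases}
  \text{treatment action} & \text{if } u(x) > 0,\\
  \text{control action}   & \text{otherwise.}
\end{cases}
\]
```

"No globally better action" then means that u(x) is positive for some students and negative for others, so a single overall average cannot settle the choice.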
31
Uplift Predictors: Conclusions
Learn how to choose an action when there is no globally better action
Clear evidence that this is feasible
Demonstrated that the effect of an action is not uniform across individuals
– focusing on individuals sensitive to the choice of action helps to build better uplift predictors
32
Predicting with Sensitive Attributes
Discrimination-aware mining; bias-aware mining
“Fairness through awareness” by Cynthia Dwork et al.
In order to treat similar individuals similarly we must collect more data about individuals.
Connections between privacy-preserving and fair predictive modeling.
“It’s Not Privacy, and it’s Not Fair”, Cynthia Dwork & Deirdre K. Mulligan
“Discrimination and Privacy in the Information Society”, Custers et al. (Eds), Springer, 2013
33
Sensitive Attributes
Demographics (gender, race, income, education of parents)
Proxies to demographics (home address or school location)
Some (un)known artifacts of data collection
– Different instances of a course
– Different instructors
– Different groups (locations)
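A rough sketch of how one might check whether an attribute such as school location acts as a proxy: compare the share of the sensitive group within each proxy value against the overall share. The records, the attribute values and the function name are all hypothetical, and a large gap is only a warning sign, not proof of a proxy.

```python
from collections import Counter


def group_share_by_value(proxy_values, sensitive):
    """For each proxy value, the share of records that belong to the sensitive group."""
    totals = Counter(proxy_values)
    in_group = Counter(v for v, s in zip(proxy_values, sensitive) if s)
    return {v: in_group[v] / totals[v] for v in totals}


# Hypothetical records: school location as a candidate proxy.
locations = ["north", "north", "north", "south", "south", "south", "south", "north"]
sensitive = [True, True, True, False, False, False, True, False]

overall = sum(sensitive) / len(sensitive)
shares = group_share_by_value(locations, sensitive)
for location, share in sorted(shares.items()):
    print(f"{location}: {share:.2f} in sensitive group (overall {overall:.2f})")
```

If the shares differ strongly from the overall share, simply dropping the sensitive attribute from the training data would not remove its influence, because the model could recover it from the proxy.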
34
Predicting with Sensitive Attributes
[Diagram: (1) training: a model L is learned from historical data of the source population, with features X, sensitive attributes S and labels y; (2) application: L is applied to testing data X' to predict labels. Other diagram labels: "Sensitive", "action?", a' = argmax(p(y'=1)).]
Training: y = L(X, S)
Application: use L for unseen data, y' = L(X', S'), enforcing P(Y|X,S) = P(Y|X)
35
Predicting with Sensitive Attributes
Accuracy-discrimination tradeoff:
– Data massaging for discrimination-free predictions (ICDM);
– discrimination-aware decision trees, Bayesian classifiers, regression (DAMI, KAIS, ICDM)
Explainable (ethical/legal) vs. unethical (ICDM)
Data mining for discrimination discovery (TKDD)
Paradox: we need to use personal data to control for unethical predictive analytics
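To make the tradeoff concrete, here is a simplified sketch loosely following the label-massaging idea named above; it is not a reproduction of the published algorithm. It assumes binary labels, a binary sensitive attribute and ranking scores supplied from elsewhere, and it measures discrimination as the difference in positive rates between the favored and the deprived group. All data and function names are hypothetical.

```python
def discrimination(labels, sensitive):
    """P(y=1 | favored) - P(y=1 | deprived); sensitive[i] is True for the deprived group."""
    fav = [y for y, s in zip(labels, sensitive) if not s]
    dep = [y for y, s in zip(labels, sensitive) if s]
    return sum(fav) / len(fav) - sum(dep) / len(dep)


def massage(labels, sensitive, scores):
    """Flip training labels in pairs until no positive discrimination remains.

    Each step promotes the deprived negative with the highest score and demotes
    the favored positive with the lowest score, so the overall number of
    positive labels stays unchanged.
    """
    labels = list(labels)
    while discrimination(labels, sensitive) > 0:
        promote = [i for i, (y, s) in enumerate(zip(labels, sensitive)) if s and y == 0]
        demote = [i for i, (y, s) in enumerate(zip(labels, sensitive)) if not s and y == 1]
        if not promote or not demote:
            break
        up = max(promote, key=lambda i: scores[i])
        down = min(demote, key=lambda i: scores[i])
        labels[up], labels[down] = 1, 0
    return labels


# Hypothetical training data: first four students favored, last four deprived.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
sensitive = [False, False, False, False, True, True, True, True]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.55, 0.4, 0.2]  # assumed ranker output

massaged = massage(labels, sensitive, scores)
print("before:", discrimination(labels, sensitive))
print("after: ", discrimination(massaged, sensitive))
```

Every flipped label makes the training data disagree with the recorded history, which is where the accuracy cost comes from. Flipping the records closest to the decision boundary is meant to keep that cost small.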
36
Predictive analytics should provide better tooling for DDE and help to eliminate Big Data fears in the changing face of modern education, rather than boost these fears among the general public, educators, students and other stakeholders.
37
Conclusions
ITS, MOOC, OLI – massive scale, cheap and scalable experimentation online
Potential for data-driven education: finding out what works best for students
– DM can help to generate promising hypotheses to test
Effects of interventions/persuasion are not uniform
– Potential and need for personalization
– DM can help to learn causal models from A/B testing data: uplift predictors
Fears of (malignant forms of) DDE and DDP
Ethics-aware and context-aware personalization