Mykola Pechenizkiy: Ethics-aware Learning Analytics. IRB and Big Data NSF Workshop, George Mason University, Arlington, VA

Presentation transcript:

1 Mykola Pechenizkiy (http://www.win.tue.nl/~mpechen/): Ethics-aware Learning Analytics. IRB and Big Data NSF Workshop, George Mason University, Arlington, VA, USA

2 Who I am. Applied data mining researcher → data scientist: predictive analytics, evolving data, big data; adaptive learning, concept drift, context; web analytics, customer/student/user analytics. Educational Data Mining/Learning Analytics-related: EDM 2011, EDM 2015, LASI 2014, JEDM; Handbook of EDM; President-Elect of IEDMS. IRB_BD@GMU, 9 Nov 2014. Ethics-aware Predictive Learning Analytics. Mykola Pechenizkiy, TU Eindhoven

3 Outline. Big Data opportunities with education platforms. Fears of Big Data coming to schools. Reconsidering priorities in developing/adopting the Data-Driven Education paradigm: ethics-awareness and trustworthiness. Take-aways: where advice from IRB panels is welcome.

4 More ICT – More Data Sources

5 Four Major Types of Learning & Kinds of Questions EDM\LA Can Assist with. How to (re)organize the classes, assessment, or placement of materials based on usage and performance data? How to identify those who would benefit from provided feedback, study advice or other help? How to decide which kind of help would be most effective? How to help learners in (re)finding useful material, whether individually or collaboratively with peers?

6 Kinds of Data Being Collected. Administrative data: who follows which program, who takes which course, who registers for an (interim) exam or re-exam; demographics, school grades, etc. MOOC and LMS: resource usage data; assessment/assignment data (online tests, source code); forums, collaboration, feedback/help requests; students’ evaluation of learning resources. ITS, educational games, professional learning, e-Health, simulators, ... Gaming, browsing, Gmail, Facebook, Twitter.

7 EDM\LA: Data → Approach → Knowledge. Data: interaction data (usage logs and contexts); “feedback” data (opinions, preferences, needs); administrative data (enrolments, results, payments, graduation, employment); descriptive data (demographics, characteristics). Approaches: classification; clustering; association analysis and sequence mining; visual analytics; process mining. What they support: categorizing students; grouping similar students; finding courses taken together or popular (parts of) study programs; understanding study curricula; facilitating reasoning about the process or results via interactive data/model visualization. Goals: identify high-risk students; predict new student application rates; predict student retention/dropout; course planning and scheduling; faculty teaching load estimation; predict demand for resources (library, cafeteria, housing); predict alumni donations.

8 Learning@Scale Potential. Two central questions in data-driven education (DDE): “Does it work?” and “Which way is better?” Ongoing research: gaining insights via (massive) A/B testing; predictive modeling with actionable attributes (prediction vs. persuasion vs. manipulation); predictive modeling with sensitive attributes (ethics-aware personalization without discrimination).

9 Data Trumps Experts’ Intuition. LAK, AIED & EDM: help in understanding what works and what does not, student modeling, etc. MOOC, ITS & L@S: A/B testing is becoming popular; MOOC platforms provide support for A/B testing. Example by Ken Koedinger (CMU) at Data-driven education @NIPS2013. Intuitive design can be replaced by data-driven design.

10 Learning@Scale Potential. Two central questions in data-driven education (DDE): “Does it work?” and “Which way is better?” Some emerging research lines: gaining insights via (massive) A/B testing; predictive modeling with actionable attributes (prediction vs. persuasion vs. manipulation); predictive modeling with sensitive attributes (ethics-aware personalization without discrimination).

11 If We Were Able to Look Deeper. How could these averages differ per: student learning style; student background; country where they studied; ethnicity; gender; parents; ...

12 Uplift Predictors. Suppose we have data from A/B testing. The control dataset: individuals on which no action was taken. The treatment dataset: individuals on which an action was taken. Build a model which predicts the causal influence of the action on a given individual. Some students prefer a story, others a formula, e.g. girls => story, boys => formula. Challenging to learn such predictors, but feasible!
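The slide does not say how an uplift predictor is built. As a minimal sketch only, assuming a single categorical feature (a hypothetical `segment`) and binary outcomes, the uplift can be estimated as the difference between the success rate in the treatment dataset and the success rate in the control dataset for each segment. The function name and the toy data are illustrative assumptions, not the author's method.

```python
from collections import defaultdict


def uplift_by_segment(control, treatment):
    """Estimate uplift per segment as P(success | treated) - P(success | control).

    Each dataset is a list of (segment, outcome) pairs with outcome in {0, 1}.
    The segment stands in for whatever features a real model would use.
    """
    def success_rates(rows):
        totals = defaultdict(int)
        successes = defaultdict(int)
        for segment, outcome in rows:
            totals[segment] += 1
            successes[segment] += outcome
        return {s: successes[s] / totals[s] for s in totals}

    rate_c = success_rates(control)
    rate_t = success_rates(treatment)
    # Uplift is only defined for segments observed in both datasets.
    return {s: rate_t[s] - rate_c[s] for s in rate_t if s in rate_c}


# Hypothetical toy data: the "action" could be explaining with a story
# instead of a formula; outcome 1 means the student passed the exercise.
control = [("A", 1), ("A", 1), ("A", 0), ("A", 1),
           ("B", 0), ("B", 1), ("B", 0), ("B", 0)]
treatment = [("A", 1), ("A", 0), ("A", 0), ("A", 0),
             ("B", 1), ("B", 1), ("B", 1), ("B", 0)]

uplift = uplift_by_segment(control, treatment)
# Apply the action only where the estimated uplift is positive.
policy = {segment: gain > 0 for segment, gain in uplift.items()}
```

In this toy data the action hurts segment A and helps segment B, which is the situation the slide describes: no action is better for everyone, so the choice has to be made per individual.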

13 cf. Personalized Medicine Paradigm. A typical medical trial: the treatment group gets the treatment; the control group gets a placebo (or another treatment); a statistical test is done to show that the treatment is better than the placebo. With uplift predictors we can find out for whom the treatment works and works best or, in case of alternative treatments, which treatment works best for whom.
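The slide does not name the statistical test used in such a trial. One common choice for binary outcomes is a one-sided two-proportion z-test; the sketch below assumes that test, and the counts are made up for illustration.

```python
import math


def two_proportion_z_test(success_t, n_t, success_c, n_c):
    """One-sided z-test of H1: treatment success rate > control success rate.

    Assumes the pooled success rate is strictly between 0 and 1
    (otherwise the standard error is zero and the test is undefined).
    """
    p_t = success_t / n_t
    p_c = success_c / n_c
    # Pooled rate under the null hypothesis of no difference.
    p_pool = (success_t + success_c) / (n_t + n_c)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical trial: 60 of 100 treated and 45 of 100 control individuals succeed.
z, p_value = two_proportion_z_test(success_t=60, n_t=100, success_c=45, n_c=100)
```

A test like this only answers whether the treatment is better on average; it says nothing about for whom it works, which is the gap uplift predictors are meant to fill.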

14 Fear of Privacy Violation & Data Misuse. “Many companies are looking to profit from student and teacher data that can be easily collected, stored, processed, customized, analyzed, and then ultimately resold.” Philip McRae (Alberta Teachers’ Association). Image: corpwatch.org/img/original/google.jpg

15 If We Were Able to Look Deeper. How could these averages differ per: student learning style; student background; country where they studied; ethnicity; gender; parents; ... These are sensitive attributes (cf. discrimination in hiring, granting credit loans, etc.).

16 Fear of Predictive Analytics. Are the decisions based on predictive models always ethical? (Personalized) decisions may be unfair to a certain group (race, ethnicity, gender). Are the models/decisions trustworthy? Do predictive models give guarantees? Is the accuracy high enough? Do models provide meaningful insights? Are they interpretable and transparent? “Correlation is not causation.”

17 Fears of Personalization. “When Personalization Goes Bad”, http://www.portical.org/blog/when-personalization-goes-bad. “Rebirth of the Teaching Machine through the Seduction of Data Analytics: This Time It's Personal”, http://www.philmcrae.com/2/post/2013/04/rebirth-of-the-teaching-maching-through-the-seduction-of-data-analytics-this-time-its-personal1.html. “This time it is Personal and Dangerous”, http://barbarabray.net/2013/12/30/this-time-its-personal-and-dangerous/. Images: Pawel Kuczynski ©; postcard (World’s Fair, Paris 1899) predicting what learning will be like in France in the year 2000.

18 Predicting with Sensitive Attributes. Paradox: we need to use personal data to control for unethical predictive analytics. “Fairness through awareness”, Dwork et al. “It’s Not Privacy, and it’s Not Fair”, Dwork & Mulligan. “Discrimination and Privacy in the Information Society”, Custers et al. (Eds). Data mining for discrimination discovery. Explainable vs. unethical discrimination. Accuracy-discrimination tradeoff.
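The slide does not define how discrimination is measured. As a minimal sketch, assuming discrimination is taken to be the difference in positive-decision rates between a favored and a deprived group (one common choice, not necessarily the one used in the cited work), the accuracy-discrimination tradeoff can be shown on toy data. All names and values below are hypothetical.

```python
def positive_rate(predictions, groups, group):
    """Share of positive decisions among members of one group."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def discrimination_score(predictions, groups, favored, deprived):
    """Difference in positive-decision rates between two groups (0 means parity)."""
    return (positive_rate(predictions, groups, favored)
            - positive_rate(predictions, groups, deprived))


def accuracy(predictions, labels):
    """Fraction of predictions that match the historical labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical historical data: "f" = favored group, "d" = deprived group.
groups = ["f", "f", "f", "f", "d", "d", "d", "d"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

model_a = [1, 1, 1, 0, 1, 0, 0, 0]  # reproduces the (biased) labels exactly
model_b = [1, 1, 0, 0, 1, 1, 0, 0]  # equal positive rates in both groups

score_a = (accuracy(model_a, labels), discrimination_score(model_a, groups, "f", "d"))
score_b = (accuracy(model_b, labels), discrimination_score(model_b, groups, "f", "d"))
```

Here `model_a` is perfectly accurate on the historical labels but carries their full disparity, while `model_b` removes the disparity at the cost of accuracy. Note that the score cannot be computed without the group attribute, which is the paradox stated on the slide.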

19 Take-aways. ITS, MOOC, OLI: massive scale, cheap and scalable experimentation online. What should be the policies on student data collection, sharing and use? Potential for data-driven education: finding out what works best for students via randomized trials. What is and is not ethical (cf. the Facebook study)? Effects of persuasion are not uniform: potential and need for personalization. DM can learn causal models from A/B testing data. How to prevent malignant forms of DDE? General guidelines for ethics-aware personalization and persuasion?

20 Thank you! Feedback, questions, collaboration ideas: m.pechenizkiy@tue.nl Staying connected: nl.linkedin.com/in/mpechen/

21 Fears of Big Data Coming to Schools: personal/educational data misuse, poor predictions, bad personalization. SIAT@SFU, 11 Aug 2014. Predictive Analytics for Data-Driven Education. Mykola Pechenizkiy, TU Eindhoven

25 Connections to Privacy & Ethics. What is the education data scientist's philosophy? Is EDM always ethical? Is EDM a threat to privacy? Dangers of misuse of information; unethical decision making or personalization. Will these discussions slow down or kill the development and adoption of predictive learning analytics?

26 Predicting with Actionable Attributes: prediction vs. manipulation; uplift predictors

27 Data Trumps Intuition. LAK, AIED & EDM: help in understanding what works and what does not, student modeling, etc. MOOC, ITS & L@S: A/B testing is becoming popular; MOOC platforms provide support for A/B testing. Example by Ken Koedinger (CMU) at Data-driven education @NIPS2013. Intuitive design can be replaced by data-driven design.

28 If We Were Able to Look Deeper. How could these averages differ per: student learning style; student background; country where they studied; ethnicity; gender; ...

29 Towards Personalized Medicine. A typical medical trial: the treatment group gets the treatment; the control group gets a placebo (or another treatment); a statistical test is done to show that the treatment is better than the placebo. With uplift predictors we can find out for whom the treatment works and works best or, in case of alternative treatments, which treatment works best for whom.

30 Uplift Predictors. Suppose we have data from A/B testing. C: the control dataset, individuals on which no action was taken. T: the treatment dataset, individuals on which an action was taken. Build a model which predicts the causal influence of the action on a given individual. Challenging if we assume that there is no globally better action (some students prefer a story, others a formula), but it is feasible.

31 Uplift Predictors: Conclusions. Learn how to choose an action when there is no globally better action. Clear evidence that this is feasible. Demonstrated that the effect of an action is not uniform across individuals: focusing on individuals sensitive to the choice of action helps to build better uplift predictors.

32 Predicting with Sensitive Attributes: discrimination-aware mining; bias-aware mining. “Fairness through awareness” by Cynthia Dwork et al.: in order to treat similar individuals similarly we must collect more data about individuals; connections between privacy-preserving and fair predictive modeling. “It’s Not Privacy, and it’s Not Fair”, Cynthia Dwork & Deirdre K. Mulligan. “Discrimination and Privacy in the Information Society”, Custers et al. (Eds), Springer, 2013.

33 Sensitive Attributes. Demographics (gender, race, income, education of parents). Proxies to demographics (home address or school location). Some (un)known artifacts of data collection: different instances of a course; different instructors; different groups (locations).

34 Predicting with Sensitive Attributes. [Diagram: a model L is (1) trained on historical data from the source population, with features X, sensitive attributes S and labels y, and (2) applied to unseen testing data X' to choose an action a' = argmax(p(y'=1)).] Training: y = L(X, S). Application: use L on unseen data, y' = L(X', S'), enforcing P(Y|X,S) = P(Y|X).

35 Predicting with Sensitive Attributes. Accuracy-discrimination tradeoff: data massaging for discrimination-free predictions (ICDM); discrimination-aware decision trees, Bayesian classifiers, regression (DAMI, KAIS, ICDM). Explainable (ethical/legal) vs. unethical discrimination (ICDM). Data mining for discrimination discovery (TKDD). Paradox: we need to use personal data to control for unethical predictive analytics.
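The transcript names data massaging but does not describe it. The sketch below is one simplified reading of the idea, assuming it means relabeling the training examples closest to the decision boundary until both groups have the same positive rate before a model is trained. The ranking `scores` are assumed to come from some ranker that is not shown, and the function name and data are hypothetical; this is not presented as the procedure of the cited ICDM paper.

```python
def massage_labels(labels, groups, scores, favored, deprived):
    """Return a relabeled copy of `labels` with (near) equal positive rates.

    Deprived-group negatives with the highest scores are promoted to 1 and
    favored-group positives with the lowest scores are demoted to 0.
    """
    idx_f = [i for i, g in enumerate(groups) if g == favored]
    idx_d = [i for i, g in enumerate(groups) if g == deprived]
    pos_f = sum(labels[i] for i in idx_f)
    pos_d = sum(labels[i] for i in idx_d)

    # Candidates nearest the boundary come first in each list.
    promote = sorted((i for i in idx_d if labels[i] == 0), key=lambda i: -scores[i])
    demote = sorted((i for i in idx_f if labels[i] == 1), key=lambda i: scores[i])

    # Solving (pos_f - m) / n_f == (pos_d + m) / n_d for m gives the number
    # of promote/demote pairs needed to equalize the two positive rates.
    m = round((len(idx_d) * pos_f - len(idx_f) * pos_d) / (len(idx_f) + len(idx_d)))
    m = max(0, min(m, len(promote), len(demote)))

    new_labels = list(labels)
    for i in promote[:m]:
        new_labels[i] = 1
    for i in demote[:m]:
        new_labels[i] = 0
    return new_labels


# Hypothetical training data: "f" = favored group, "d" = deprived group.
groups = ["f", "f", "f", "f", "d", "d", "d", "d"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.55, 0.2, 0.1]  # assumed ranker output

massaged = massage_labels(labels, groups, scores, favored="f", deprived="d")
```

Only one pair of labels changes in this example, and it is the pair the ranker is least sure about, which is how such preprocessing tries to limit the loss in accuracy.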

36 Predictive analytics should provide better tooling for DDE and help to eliminate Big Data fears in the changing face of modern education, not boost these fears among the general public, educators, students and other stakeholders.

37 Conclusions. ITS, MOOC, OLI: massive scale, cheap and scalable experimentation online. Potential for data-driven education: finding out what works best for students; DM can help to generate promising hypotheses to test. Effects of interventions/persuasion are not uniform: potential and need for personalization; DM can help to learn causal models from A/B testing data (uplift predictors). Fears of (malignant forms of) DDE and DDP. Ethics-aware and context-aware personalization.

38 Thank you! Feedback, questions, collaboration ideas: m.pechenizkiy@tue.nl Staying connected: nl.linkedin.com/in/mpechen/

