Published by Carlee Folds. Modified over 3 years ago.

1
Causal Data Mining
Richard Scheines
Dept. of Philosophy, Machine Learning, & Human-Computer Interaction
Carnegie Mellon

2
1. Predictive Data Mining
Finding predictive relationships in data
–What feature of student behavior predicts learning?
–Who will default on credit cards?
–Who will get an “A” in your course?
–Which HS students will do well at CMU?
–Do students cluster by “learning style”?

3
Causal Data Mining
Finding causal relationships in data
–What feature of student behavior causes learning?
–What will happen when we make everyone take a reading quiz before each class?
–What will happen when we program our tutor to intervene to give hints after an error?

4
Predictive Data Mining

      X1    X2    X3    ..    Xk    Y
1     1.7   28    M     ..    2.4   1
2     2.0   11    F     ..    1.1   0
3     1.9   17    F     ..    1.1   1
..    ..    ..    ..    ..    ..    ..
N     2.8   12    M     ..    1.8   0

Data Mining Search
Predictive Model: Y = f(X1, X2, …, Xk)

5
Predictive Data Mining
Data Mining Search
Predictive Model: Y = f(X1, X2, …, Xk)
Model Classes
1. Simple Regression
2. Locally Weighted Regression
3. Logistic Regression
4. Neural Nets
5. Support Vector Machines
6. Decision Trees
7. Bayes Net
8. Naïve Bayes Classifier
9. Independent Components
10. Clustering
11. Etc.
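To make the idea of searching a model class concrete, here is a minimal sketch for the first class listed above, simple regression, where the "search" reduces to choosing the intercept and slope that minimize squared error. The function name `fit_simple_regression` and the four data points are hypothetical and are not taken from the slides.

```python
def fit_simple_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

intercept, slope = fit_simple_regression(xs, ys)


def predict(x):
    """Predictive model Y = f(X) found by the search."""
    return intercept + slope * x
```

The fitted `predict` function answers predictive questions only: it says what Y to expect when X is observed at some value, not what Y would become if X were changed.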

6
Predictive Data Mining
Data Mining Search
Predictive Model under Constraints:
Y = f(X1, X2, …, Xk), e.g., f ∈ additive functions

7
Predictive Data Mining
Data Mining Search
Predictive Model under Constraints:
Y = f(X1, X2, …, Xk),
or Probability Model under Constraints:
P(Y | X1, X2, …, Xk), where P is Gaussian, with mean 0

8
Predictive Data Mining
Decision Tree Search

9
Predictive Data Mining ≠ Causal Data Mining
P(Y | X1, X2, …, Xk) ≠ P(Y | X1 set, X2, …, Xk)
Conditioning is not the same as intervening
Teeth Slides
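The gap between conditioning and intervening can be seen in a small simulation. The sketch below assumes a hypothetical system, not one taken from the slides, in which an unmeasured common cause Z drives both X and Y while X has no effect on Y. All names and probabilities are illustrative assumptions.

```python
import random


def simulate(n, set_x=None, seed=0):
    """Draw n (x, y) pairs. If set_x is given, X is set by intervention
    and no longer depends on the common cause Z."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                        # hidden common cause
        if set_x is None:
            x = rng.random() < (0.9 if z else 0.1)    # X depends on Z
        else:
            x = set_x                                 # X is set from outside
        y = rng.random() < (0.8 if z else 0.2)        # Y depends on Z only
        rows.append((x, y))
    return rows


def prob_y_given_x(rows, x_value):
    """Estimate P(Y = 1 | X = x_value) from the sampled rows."""
    matching = [y for x, y in rows if x == x_value]
    return sum(matching) / len(matching)


observed = simulate(100_000)
intervened = simulate(100_000, set_x=True, seed=1)

p_conditioned = prob_y_given_x(observed, True)    # P(Y | X = 1), about 0.74
p_intervened = prob_y_given_x(intervened, True)   # P(Y | X set to 1), about 0.50
```

In this assumed system, observing X = 1 makes Z = 1 more likely and so raises the probability of Y, while setting X = 1 leaves Z, and therefore Y, unchanged. A predictive model fit to the observed rows would estimate the first quantity, not the second.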

10
Causal Discovery
Statistical Data → Causal Structure
Background Knowledge
- X2 before X3
- no unmeasured common causes
Statistical Inference

11
Causal Discovery Software
TETRAD IV
www.phil.cmu.edu/projects/tetrad

12
Full Semester Online Course in Causal & Statistical Reasoning

13
Course is tooled to record certain events:
logins, page requests, print requests, quiz attempts, quiz scores, voluntary exercises attempted, etc.
Each event is associated with attributes:
- Time
- Student-id
- Session-id
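One way such an event log could be represented and summarized is sketched below. The `Event` class, its field names, the event-kind strings, and the sample records are all hypothetical; the slides do not specify the course's actual logging format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One logged course event with the three attributes named on the slide."""
    kind: str         # e.g. "login", "page_request", "print_request"
    time: str         # timestamp, kept as text in this sketch
    student_id: str
    session_id: str


def count_by_student(log, kind):
    """Count how many events of the given kind each student generated."""
    return Counter(e.student_id for e in log if e.kind == kind)


# Hypothetical sample records, for illustration only.
log = [
    Event("login", "2003-01-15T09:00", "s1", "a"),
    Event("print_request", "2003-01-15T09:05", "s1", "a"),
    Event("print_request", "2003-01-15T09:20", "s1", "a"),
    Event("login", "2003-01-15T10:00", "s2", "b"),
    Event("quiz_attempt", "2003-01-15T10:30", "s2", "b"),
]

prints_per_student = count_by_student(log, "print_request")
```

Per-student counts like these are the kind of behavioral feature that could then be entered as variables in a predictive or causal search.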

14
Printing and Voluntary Comprehension Checks: 2002 --> 2003
[Figure: panels for 2002 and 2003]
