Boilé M., M.M. Golias, & S. Ivey


Boilé M., M.M. Golias, & S. Ivey

Contents
- Introduction
- Motivation
- Case study

Introduction
Freight demand modeling & regression techniques: one of the best and worst tools we have. Problems come from:
- Data
- Misleading performance measures
When data is limited, regression techniques cannot perform well (after all, they are pattern recognition techniques). Even worse, sometimes we rely on training-based measures of performance.

Typical regression
- Input: we want to predict Y. We believe that a number of known inputs X can predict Y based on a function.
- Black box (or not): run an algorithm to select which of the variables we selected are actually meaningful, and what the parameters of the function are.
- Output: obtain a model and performance measures; use the model.
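The input → black box → output pipeline above can be sketched for the simplest case: one predictor fitted by ordinary least squares. The data and names below are made up for illustration; they are not from the case study.

```python
# Minimal sketch of the "typical regression" pipeline with one predictor:
# fit Y ~ b0 + b*X by ordinary least squares, then forecast at a new X.

def fit_simple_ols(xs, ys):
    """Return (intercept b0, slope b) minimizing squared error for Y ~ b0 + b*X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical data, roughly y = 1 + 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
b0, b = fit_simple_ols(xs, ys)
pred = b0 + b * 6.0  # use the fitted model to forecast at a new input
```

The "black box" in a real study would also decide *which* inputs enter the model; here there is only one, so the sketch reduces to estimation and forecasting.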

Although not really the case, we assume that Y is a linear function of X. Is this assumption correct? Given our usual data availability, non-linear models will not (necessarily) perform better.
What are the best inputs? Two mentalities:
- Throw in whatever you can find
- Use some rationale

There are a number of algorithms for regression:
- Most of them select some of the X's (variable selection)
- Some of them add constraints to the Y's (constrained regression)
- Some of them add constraints to the effect of the X's (shrinkage techniques)
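The first family, variable selection, can be sketched as greedy forward selection: repeatedly add the candidate X that most reduces the residual sum of squares. Everything below (the solver, the data, the function names) is an illustrative sketch in plain Python, not the algorithm used in the talk.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [u - f * v for u, v in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_rss(X_cols, ys):
    """Fit Y ~ intercept + selected columns; return residual sum of squares."""
    n = len(ys)
    Xd = [[1.0] + [col[i] for col in X_cols] for i in range(n)]
    p = len(Xd[0])
    XtX = [[sum(Xd[i][a] * Xd[i][c] for i in range(n)) for c in range(p)] for a in range(p)]
    Xty = [sum(Xd[i][a] * ys[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    return sum((ys[i] - sum(bj * Xd[i][j] for j, bj in enumerate(beta))) ** 2
               for i in range(n))

def forward_select(cols, ys, k):
    """Greedy forward selection: k times, add the column that most reduces RSS."""
    chosen, remaining = [], list(range(len(cols)))
    for _ in range(k):
        best = min(remaining,
                   key=lambda j: ols_rss([cols[i] for i in chosen] + [cols[j]], ys))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Synthetic check: Y depends on columns 0 and 2 only; column 1 is noise.
random.seed(0)
n = 40
x1 = [random.uniform(0, 1) for _ in range(n)]
x2 = [random.uniform(0, 1) for _ in range(n)]
x3 = [random.uniform(0, 1) for _ in range(n)]
ys = [3.0 * x1[i] - 2.0 * x3[i] + random.gauss(0, 0.1) for i in range(n)]
chosen = forward_select([x1, x2, x3], ys, 2)
```

Backward elimination works the same way in reverse: start with all candidates and drop the one whose removal hurts least.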

Two main measures of performance:
- What is the error of the model (R²)?
- Are the model and inputs significant (p-values)?
When many independent variables are used, variable selection techniques can lead to models with high R². Some accept performance measures based on the data used to train the model (not such a good idea); some use what is called a hold-out sample (more appropriate).
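The training-based versus hold-out distinction can be made concrete by computing R² on both subsets. The data and the 90/10 split below are invented for the sketch, assuming a single predictor.

```python
import random

def fit_simple_ols(xs, ys):
    """Least-squares fit of Y ~ b0 + b*X; returns (b0, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

random.seed(1)
xs = [random.uniform(0, 10) for _ in range(100)]
ys = [1.0 + 2.0 * x + random.gauss(0, 2.0) for x in xs]

# 90% for estimation, 10% held out for validation
train_x, test_x = xs[:90], xs[90:]
train_y, test_y = ys[:90], ys[90:]
b0, b = fit_simple_ols(train_x, train_y)
r2_train = r_squared(train_y, [b0 + b * x for x in train_x])
r2_test = r_squared(test_y, [b0 + b * x for x in test_x])
```

With many candidate X's and aggressive selection, `r2_train` can be driven high while `r2_test` stays poor, which is exactly why the hold-out number is the one to trust.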

Data, data, data
- Selection of inputs: we need data
- Performance measures: we need data
- Testing of the model: we need data
So what can we do when we have limited data? Simulation looks like a good approach that has worked in other areas.

Markov Chain Monte Carlo Simulation
Typical regression: a linear model with selection. Closed-form solution using some heuristic (e.g., backward selection, forward selection) instead of going through all the possible subsets.
Instead of a closed-form solution, we can assume prior and posterior distributions for the variables (we can also do that for the parameters, but let's talk about that some other time) and use simulation (more precisely, MCMC simulation).
Why use simulation? The integrals are intractable; MCMC simulation takes us from the priors to the posteriors.
Is it better? One way to find out!
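A minimal version of the MCMC idea is a random-walk Metropolis sampler for the two coefficients of a simple linear model. This sketch assumes a fixed noise scale and diffuse Normal priors, simplifications made here for brevity, not details from the talk.

```python
import math
import random

def log_post(b0, b1, xs, ys, sigma=2.0, prior_sd=100.0):
    """Unnormalized log posterior: Gaussian likelihood + diffuse Normal priors."""
    ll = sum(-0.5 * ((y - b0 - b1 * x) / sigma) ** 2 for x, y in zip(xs, ys))
    lp = -0.5 * (b0 / prior_sd) ** 2 - 0.5 * (b1 / prior_sd) ** 2
    return ll + lp

def metropolis(xs, ys, n_iter=20000, step=0.1):
    """Random-walk Metropolis over (b0, b1); returns post-burn-in samples."""
    random.seed(42)
    b0, b1 = 0.0, 0.0
    cur = log_post(b0, b1, xs, ys)
    samples = []
    for _ in range(n_iter):
        c0 = b0 + random.gauss(0, step)
        c1 = b1 + random.gauss(0, step)
        cand = log_post(c0, c1, xs, ys)
        # Accept with probability min(1, exp(cand - cur))
        if math.log(random.random()) < cand - cur:
            b0, b1, cur = c0, c1, cand
        samples.append((b0, b1))
    return samples[n_iter // 2:]  # discard the first half as burn-in

# Synthetic data: true model y = 1 + 2x + noise (x centered to help mixing)
random.seed(7)
xs = [random.uniform(-5, 5) for _ in range(60)]
ys = [1.0 + 2.0 * x + random.gauss(0, 2.0) for x in xs]
post = metropolis(xs, ys)
mean_b1 = sum(s[1] for s in post) / len(post)
```

The posterior samples play the role of the intractable integrals: posterior means, intervals, and inclusion probabilities all become simple averages over the chain.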

Case Study
Prediction of truck volumes on state highways in New Jersey.
Major assumption: truck volumes can be predicted given socioeconomic data surrounding the highway.

Case study: Data
Dependent dataset: 270 locations throughout NJ (long- and short-duration classification counts)
- Long-duration counts: Weigh-In-Motion (WIM) locations
- Short-duration vehicle classification counts
- Vehicle classes 5 through 13 (FHWA classification)
34 independent variables:
- Population
- Number of employees (11 SIC codes)
- Sales volume (11 SIC codes)
- Number of establishments (11 SIC codes)

Case Study: Traffic counts by roadway class
Table 1. Clustered Dataset by Highway FC and Count Availability

Functional Class (FC)                                          Counts (# Observations)
A: 1, 2     (Rural interstate and major arterials)             31
B: 6, 7, 8, 9 (Rural minor arterials, collectors, and local)   51
C: 11       (Urban interstate)                                 29
D: 12       (Urban expressways and parkways)                   20
E: 14       (Urban major arterials)                            59
F: 16, 17, 19 (Urban minor arterials, collectors, and local)   80

Case Study: Bandwidth of sections
- Uniform highway sections: major interchanges, roadway functionality, geometry
- Nine different bandwidths (0.25, 0.50, 0.75, 1.0, 1.25, 1.5, 2, 3 and 5 miles)
- Nine different models were estimated for each FC
- Different models => sensitivity with increasing size of the area

Model
What do we want to achieve?
1. Select the most appropriate X's out of a pool of candidate predictors
2. Constrain the values of Y
3. Constrain the influence of the selected X's
A priori, none of the variables can explain truck volumes. The dependent variable can only take positive values. Diffuse priors.
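One common way to keep predictions of the dependent variable positive is to model the log of the volumes, so that back-transformed predictions are positive by construction. This is a stand-in sketch for illustration, with made-up data, not the constrained Bayesian formulation actually used in the talk.

```python
import math
import random

def fit_simple_ols(xs, ys):
    """Least-squares fit of Y ~ b0 + b*X; returns (b0, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Hypothetical positive "volumes": log(volume) = 0.5 + 0.2*x + noise
random.seed(3)
xs = [random.uniform(1, 10) for _ in range(50)]
vols = [math.exp(0.5 + 0.2 * x + random.gauss(0, 0.1)) for x in xs]

# Fit on the log scale, predict on the original scale
b0, b = fit_simple_ols(xs, [math.log(v) for v in vols])
preds = [math.exp(b0 + b * x) for x in xs]  # strictly positive by construction
```

In the Bayesian model the same effect can be achieved directly through the likelihood or a truncated prior, which is closer to what the slide's "constrain the values of Y" refers to.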

Results
Models compared:
- Bayesian regression model (BRM)
- Stepwise linear regression (SLR)
- Statewide model (4-step planning model) (SWTM)
Cross-validation with a 90% - 10% estimation-validation dataset split; comparison based on R² values.

Usability for Practitioners
BUGS: the best thing since sliced bread! It's free and easy to use:
bsu.cam.ac.uk/bugs/winbugs/contents.shtml
