CMS SAS Users Group Conference Learn more about THE POWER TO KNOW ® October 17, 2011 Medicare Payment Standardization Modeling using SAS Enterprise Miner.

Presentation transcript:

CMS SAS Users Group Conference, October 17, 2011
Medicare Payment Standardization Modeling using SAS Enterprise Miner
Michelle Roozeboom, PhD & Brian O’Donnell, PhD
Buccaneer, A General Dynamics Company

Chronic Condition Warehouse
Research database containing patient-centric data
– 1999-current: 100% Medicare and Medicaid data linked across the continuum of care
– All CCW data files linked by assignment of a unique, unidentifiable beneficiary link key (BENE_ID) for ease of analysis across data files (unique link keys replace health insurance claim (HIC) numbers)

Background of Geographic Variation Database
– Uses the CCW as source data
– Created under a contract with the Policy Data Analysis Group
– Standardizes Medicare’s payment amounts to remove geographic differences and hospital characteristic differences in payment rates for individual services
– Service Utilization Files are aggregated to produce the Beneficiary Service Level Files, which contain one record per beneficiary for each service classification

Current Standardization Process
Institutional (Part A) standardization
– Determine a base rate and multiply that rate by weights determined by the particular service performed (e.g., DRG weights)
– Add information for other relevant factors, such as geographic outlier payments
– Recalculate the payment amount, removing geographic factors such as wage index factors, medical teaching status, and Disproportionate Share Hospital (DSH) factors
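The Part A steps can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the simplified form of the geographic adjustment, and all numbers are hypothetical and are not the CMS payment rules or the authors' implementation.

```python
def standardized_payment(base_rate, drg_weight, outlier_amt=0.0):
    """Geography-neutral payment: base rate times service weight, plus outlier.

    Hypothetical helper; the real standardization rules are more detailed.
    """
    return base_rate * drg_weight + outlier_amt


def adjusted_payment(base_rate, drg_weight, wage_index, labor_share,
                     teaching_adj=0.0, dsh_adj=0.0, outlier_amt=0.0):
    """Simplified, assumed form of a geographically adjusted payment.

    The wage index scales only the labor share of the base amount, and the
    teaching and DSH adjustments are applied as additive percentages.
    """
    geo_factor = labor_share * wage_index + (1.0 - labor_share)
    hospital_factor = 1.0 + teaching_adj + dsh_adj
    return base_rate * drg_weight * geo_factor * hospital_factor + outlier_amt


# Illustrative values only (not real rates or weights).
example_standardized = standardized_payment(5000.0, 1.5, outlier_amt=200.0)
example_adjusted = adjusted_payment(
    5000.0, 1.5, wage_index=1.2, labor_share=0.7,
    teaching_adj=0.05, dsh_adj=0.03, outlier_amt=200.0,
)
print(example_standardized, round(example_adjusted, 2))
```

Standardization in this sketch amounts to recomputing the payment with the geographic and hospital factors set to neutral values, so two claims for the same service receive the same standardized amount regardless of where they were paid.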

Current Standardization Process
Non-Institutional (e.g., Part B and hospital outpatient services) standardization
– Link information on the claim with the appropriate fee schedule
– Derive a national rate as opposed to applying a geographically specific rate

SAS Enterprise Miner Project Overview

Graph Explore: Review of Part A Data

Data Preparation
2008 Part A service files were used for analysis
– Limited to Acute Inpatient Services
– 20% claim sample was used
– Actual payment was “target” variable
– Included an identifier for acute services for Maryland providers

Data Preparation

Graph Explore Output Screen

Stat Explore Output Screen

Strength of Relationship with Dependent Variable

Correlation With Dependent Variable

Categorical Variables

Variability for Categorical Variables

Regression Analysis
– Linear Regression with actual payment as dependent variable
– Stepwise regression was used to fit the model
– Model was fitted both with and without interactions
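The idea of stepwise variable selection can be sketched on synthetic data. This is an illustration under stated assumptions, not the Enterprise Miner procedure: it performs forward selection only, and it admits a variable when the relative drop in the sum of squared errors exceeds a hypothetical threshold (`min_gain`) rather than using significance levels for entry and removal. The data and function names are invented for the example.

```python
import numpy as np


def sse_of_fit(X, y):
    """Sum of squared errors of a least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def forward_select(X, y, min_gain=0.10):
    """Add one predictor at a time while SSE drops by at least min_gain."""
    selected = []
    remaining = list(range(X.shape[1]))
    # Baseline: intercept-only model (no predictor columns).
    current_sse = sse_of_fit(np.empty((len(y), 0)), y)
    while remaining:
        trials = {j: sse_of_fit(X[:, selected + [j]], y) for j in remaining}
        best = min(trials, key=trials.get)
        if (current_sse - trials[best]) / current_sse < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current_sse = trials[best]
    return selected


# Synthetic data: columns 0 and 2 drive the target, column 1 is noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

selected = forward_select(X, y)
print("selected columns:", selected)
```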

Regression Output Screen – No Interactions

Plots of Mean Predicted Payment Compared to Actual Payment

Plots of Maximum Predicted Payment Compared to Actual Payment

Regression ANOVA table
Analysis of Variance (columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F)
– Model: Pr > F <.0001
– Error
– Corrected Total
Model Fit Statistics (columns: R-Square, Adj R-Sq, AIC, BIC, SBC, C(p))

Model Fit Statistics
Statistics Label                  Train                 Validation            Test
Akaike's Information Criterion    16,214,149
Average Squared Error             45,520,013            46,213,608            45,574,862
Average Error Function            45,520,013            46,213,608            45,574,862
Degrees of Freedom for Error      919,453
Model Degrees of Freedom          42
Total Degrees of Freedom          919,495
Divisor for ASE                   919,495               689,621               689,622
Error Function                    41,855,424,402,171    31,869,874,606,098    31,429,427,255,852
Final Prediction Error            45,524,172
Maximum Absolute Error            411,742               979,415               311,907
Mean Square Error                 45,522,092            46,213,608            45,574,862
Sum of Frequencies                919,495               689,621               689,622
Number of Estimate Weights        42
Root Average Sum of Squares       6,747                 6,798                 6,751
Root Final Prediction Error       6,747
Root Mean Squared Error           6,747                 6,798                 6,751
Schwarz's Bayesian Criterion      16,214,641
Sum of Squared Errors             41,855,424,402,171    31,869,874,606,098    31,429,427,255,852
Sum of Case Weights Times Freq    919,495               689,621               689,622
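The training-column figures for the model without interactions are related by simple arithmetic, which the following sketch reproduces from the reported sum of squared errors, number of observations, and model degrees of freedom. The Final Prediction Error formula used here is an assumption; it is included because it reproduces the reported value.

```python
import math

# Reported training values for the model without interactions.
sse_train = 41_855_424_402_171   # Sum of Squared Errors
n_train = 919_495                # Divisor for ASE / Sum of Frequencies
model_df = 42                    # Model Degrees of Freedom

# Average Squared Error divides by the number of observations.
ase = sse_train / n_train

# Mean Square Error divides by the error degrees of freedom instead.
dfe = n_train - model_df
mse = sse_train / dfe
rmse = math.sqrt(mse)

# Assumed form of the Final Prediction Error: MSE * (n + p) / n.
fpe = mse * (n_train + model_df) / n_train

print(f"ASE  = {ase:,.0f}")
print(f"DFE  = {dfe:,}")
print(f"MSE  = {mse:,.0f}")
print(f"RMSE = {rmse:,.0f}")
print(f"FPE  = {fpe:,.0f}")
```

With more than 900,000 training claims and only 42 parameters, the three error measures differ by only a few thousand out of roughly 45.5 million, which is why their square roots all round to 6,747.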

Variable Significance
Type 3 Analysis of Effects (columns: Effect, DF, Sum of Squares, F Value, Pr > F)
– CLM_IP_ADMSN_TYPE_CD: Pr > F <.0001
– CLM_PASS_THRU_PER_DIEM_AMT: Pr > F <.0001
– CLM_SRC_IP_ADMSN_CD: Pr > F <.0001
– CLM_UTLZTN_DAY_CNT: Pr > F <.0001
– IP_COINS: Pr > F <.0001
– IP_DED: Pr > F <.0001
– NCH_CLM_TYPE_CD: Pr > F <.0001
– NCH_DRG_OUTLIER_APRVD_PMT_AMT: Pr > F <.0001
– PTNT_DSCHRG_STUS_CD: Pr > F <.0001
– TRANSFER MD: Pr > F <.0001

Regression Output Screen – Interactions Included

Regression Output – Interactions Included
Analysis of Variance (columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F)
– Model: Pr > F <.0001
– Error
– Corrected Total
Model Fit Statistics (columns: R-Square, Adj R-Sq, AIC, BIC, SBC, C(p))

Model Fit Statistics – Interactions Included
Statistics Label                  Train                 Validation            Test
Akaike's Information Criterion    16,203,241
Average Squared Error             44,950,470            45,901,182            45,230,501
Average Error Function            44,950,470            45,901,182            45,230,501
Degrees of Freedom for Error      919,118
Model Degrees of Freedom          377
Total Degrees of Freedom          919,495
Divisor for ASE                   919,495               689,621               689,622
Error Function                    41,331,732,859,839    31,654,419,084,526    31,191,948,490,416
Final Prediction Error            44,987,346
Maximum Absolute Error            372,404               981,944               308,056
Mean Square Error                 44,968,908            45,901,182            45,230,501
Sum of Frequencies                919,495               689,621               689,622
Number of Estimate Weights        377
Root Average Sum of Squares       6,705                 6,775                 6,725
Root Final Prediction Error       6,707
Root Mean Squared Error           6,706                 6,775                 6,725
Schwarz's Bayesian Criterion      16,207,664
Sum of Squared Errors             41,331,732,859,839    31,654,419,084,526    31,191,948,490,416
Sum of Case Weights Times Freq    919,495               689,621               689,622

Validation Dataset Results
Data Role=VALIDATE, Target Variable=ACTUAL_PMT
Columns: Percentile, Number of Observations, Mean Target, Mean Predicted

Scored Distribution

Scoring Results

SAS code Tab

Comparison
– Model calculated standardized payments were compared to standardized payments resulting from the current method
– Difference in standardized payments was calculated as:
Standardized Difference = Standardized Payment – Predicted Actual Payment
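The comparison can be expressed as a short calculation. The field names `stand_diff` and `abs_stand_diff` follow the variable names that appear in the summary statistics; the function name, the other field names, and the claim values are made up for illustration.

```python
def standardized_difference(standardized_pmt, predicted_actual_pmt):
    """Standardized Difference = Standardized Payment - Predicted Actual Payment."""
    return standardized_pmt - predicted_actual_pmt


# Hypothetical claims: rule-based standardized payment vs. model prediction.
claims = [
    {"standardized_pmt": 5200.0, "predicted_actual_pmt": 5000.0},
    {"standardized_pmt": 4800.0, "predicted_actual_pmt": 5100.0},
]

for claim in claims:
    diff = standardized_difference(
        claim["standardized_pmt"], claim["predicted_actual_pmt"]
    )
    claim["stand_diff"] = diff
    # The absolute difference measures the size of the gap regardless of sign.
    claim["abs_stand_diff"] = abs(diff)

for claim in claims:
    print(claim["stand_diff"], claim["abs_stand_diff"])
```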

Summary Statistics
Columns: Variable, Median, Minimum, Maximum, Mean, Standard Deviation, Skewness, Kurtosis, Abs C.V., Coefficient of Variation
Variables: CLM_NON_UTLZTN_DAYS_CNT, IP_COINS, NCH_DRG_OUTLIER_APRVD_PMT_AMT, stand_diff, CLM_PASS_THRU_PER_DIEM_AMT, abs_stand_diff, ACTUAL_PMT, IP_DED, Predicted_Actual_PMT

Correlation with Difference in Standardized Payments
– Claim Utilization Day Count
– DRG Outlier Approved Payment Amount
– IP Deductible
– IP Coinsurance

Variability of Categorical Variables

Evaluation of Standardized Differences

Conclusions and Lessons Learned
– A multilinear model without interactions and polynomial terms closely approximates a standardized payment based on payment rules for non-extreme payments
– A more accurate model may require splicing of data for high and low payments
– Large differences between standardized payments may be related to payments from Maryland providers and to claim type

Conclusions and Lessons Learned
– Enterprise Miner allows for graphical evaluation of results for each step of a modeling process
– Different model parameters can be changed and applied quickly and easily
– Measurement scale for variables must be accurate, or results may be inaccurate or not produced

Conclusions and Lessons Learned
– Class variables should have a maximum of 512 categories
– Regression results will still be produced regardless of underlying assumptions
– Assumptions should still be tested and evaluated to ensure accuracy of results

Questions