CMS SAS Users Group Conference, October 17, 2011
Medicare Payment Standardization Modeling Using SAS Enterprise Miner
Michelle Roozeboom, PhD & Brian O’Donnell, PhD
Buccaneer, A General Dynamics Company
Chronic Condition Warehouse
Research database containing patient-centric data
– 1999-current: 100% Medicare and Medicaid data linked across the continuum of care
– All CCW data files are linked by a unique, unidentifiable beneficiary link key (BENE_ID) for ease of analysis across data files (unique link keys replace health insurance claim (HIC) numbers)
Background of Geographic Variation Database
Uses the CCW as source data
Created under a contract with the Policy Data Analysis Group
Standardizes Medicare’s payment amounts to remove geographic differences and hospital characteristic differences in payment rates for individual services
Service Utilization Files are aggregated to produce the Beneficiary Service Level Files, which contain one record per beneficiary for each service classification
Current Standardization Process
Institutional (Part A) standardization
– Determine a base rate and multiply that rate by weights determined by the particular service performed (e.g., DRG weights)
– Add information for other relevant factors, such as geographic outlier payments
– Recalculate the payment amount, removing geographic factors such as wage index factors, medical teaching status, and Disproportionate Share Hospital (DSH) factors
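The Part A steps above can be illustrated with a minimal Python sketch. It is not the CMS payment rule: the field names, the way the factors combine in `locality_adjusted_payment`, and the dollar amounts are all assumptions made only to show the idea that two hospitals with different geographic factors receive the same standardized amount for the same service.

```python
from dataclasses import dataclass


@dataclass
class InpatientClaim:
    # All field names are hypothetical, chosen for illustration only.
    drg_weight: float       # relative weight of the service (e.g., a DRG weight)
    outlier_amount: float   # outlier amount carried with the claim
    wage_index: float       # geographic factor, removed by standardization
    teaching_factor: float  # teaching-status add-on, removed by standardization
    dsh_factor: float       # DSH add-on, removed by standardization


def locality_adjusted_payment(claim: InpatientClaim, base_rate: float) -> float:
    # Assumed stand-in for a payment that varies by area and hospital type.
    adjusted = base_rate * claim.wage_index * claim.drg_weight
    return adjusted * (1.0 + claim.teaching_factor + claim.dsh_factor) + claim.outlier_amount


def standardized_payment(claim: InpatientClaim, base_rate: float) -> float:
    # Base rate times service weight, plus the outlier amount;
    # wage index, teaching, and DSH factors are deliberately left out.
    return base_rate * claim.drg_weight + claim.outlier_amount


BASE_RATE = 5000.0  # hypothetical base rate
rural = InpatientClaim(1.5, 0.0, wage_index=0.9, teaching_factor=0.0, dsh_factor=0.0)
urban = InpatientClaim(1.5, 0.0, wage_index=1.3, teaching_factor=0.25, dsh_factor=0.1)

for name, claim in (("rural", rural), ("urban", urban)):
    print(name,
          locality_adjusted_payment(claim, BASE_RATE),
          standardized_payment(claim, BASE_RATE))
```

The adjusted amounts differ between the two hospitals, while the standardized amounts match because only the service weight and outlier amount remain.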
Current Standardization Process
Non-Institutional (e.g., Part B and hospital outpatient services) standardization
– Link information on the claim with the appropriate fee schedule
– Derive a national rate as opposed to applying a geographically specific rate
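A minimal sketch of the fee-schedule lookup described above, in Python. The service codes, rates, and function name are hypothetical; a real fee schedule has many more keys (modifiers, place of service, dates) than this illustration assumes.

```python
# Hypothetical national fee schedule: service code -> national rate per unit.
NATIONAL_FEE_SCHEDULE = {"SVC_A": 70.0, "SVC_B": 25.0}


def standardize_line(service_code, units, fee_schedule=NATIONAL_FEE_SCHEDULE):
    """Price a claim line at the national rate instead of a locality rate."""
    rate = fee_schedule.get(service_code)
    if rate is None:
        # No national rate found; the caller decides how to handle the line.
        return None
    return rate * units


print(standardize_line("SVC_A", 2))
print(standardize_line("UNKNOWN", 1))
```

Returning `None` for an unmatched code keeps unpriced lines visible rather than silently assigning them a zero payment.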
SAS Enterprise Miner Project Overview
Graph Explore: Review of Part A Data
Data Preparation
2008 Part A service files were used for analysis
– Limited to Acute Inpatient Services
– 20% claim sample was used
– Actual payment was the “target” variable
– Included an identifier for acute services for Maryland providers
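The preparation steps above could look roughly like the following Python sketch. The record layout (`service_type`, `provider_state`), the flag name `md_provider`, and the order of filtering before sampling are assumptions; the slides do not say how the sample was drawn.

```python
import random


def prepare_claims(claims, sample_fraction=0.20, seed=2008):
    """Keep acute inpatient claims, draw a random sample, add a Maryland flag."""
    acute = [c for c in claims if c["service_type"] == "ACUTE_INPATIENT"]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = round(len(acute) * sample_fraction)
    sample = rng.sample(acute, k)
    # md_provider = 1 when the (hypothetical) provider_state field is Maryland.
    return [dict(c, md_provider=int(c["provider_state"] == "MD")) for c in sample]


# Small synthetic input: 100 claims, half of them acute inpatient.
claims = [
    {
        "claim_id": i,
        "service_type": "ACUTE_INPATIENT" if i % 2 == 0 else "SNF",
        "provider_state": "MD" if i % 10 == 0 else "VA",
        "ACTUAL_PMT": 1000.0 + i,
    }
    for i in range(100)
]

prepared = prepare_claims(claims)
print(len(prepared), sum(c["md_provider"] for c in prepared))
```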
Data Preparation
Graph Explore Output Screen
Stat Explore Output Screen
Strength of Relationship with Dependent Variable
Correlation With Dependent Variable
Categorical Variables
Variability for Categorical Variables
Regression Analysis
Linear regression with actual payment as the dependent variable
Stepwise regression was used to fit the model
Model was fitted both with and without interactions
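To give a feel for variable selection, here is a small pure-Python sketch. It is a simplification, not the Enterprise Miner procedure: it does greedy forward selection on the reduction in sum of squared errors, whereas stepwise regression can also drop variables and usually relies on significance tests. The data, variable names, and the `min_gain` threshold are made up for illustration.

```python
def ols_sse(columns, y):
    """Fit y on an intercept plus the given columns; return the sum of squared errors."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    M = [
        [sum(X[i][r] * X[i][s] for i in range(n)) for s in range(p)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]
    # Gaussian elimination with partial pivoting (assumes no collinear columns).
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, p):
            f = M[r][k] / M[k][k]
            for s in range(k, p + 1):
                M[r][s] -= f * M[k][s]
    # Back substitution for the coefficients.
    b = [0.0] * p
    for k in reversed(range(p)):
        b[k] = (M[k][p] - sum(M[k][s] * b[s] for s in range(k + 1, p))) / M[k][k]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))


def forward_select(candidates, y, min_gain=0.01):
    """Add the variable that cuts SSE most; stop when the gain in R-square is below min_gain."""
    total = ols_sse([], y)  # intercept-only SSE
    chosen, current = [], total
    while len(chosen) < len(candidates):
        best_name, best_sse = None, current
        for name in candidates:
            if name in chosen:
                continue
            sse = ols_sse([candidates[c] for c in chosen] + [candidates[name]], y)
            if sse < best_sse:
                best_name, best_sse = name, sse
        if best_name is None or (current - best_sse) <= min_gain * total:
            break
        chosen.append(best_name)
        current = best_sse
    return chosen


# Synthetic example: payment depends on days and outlier amount, not on noise.
days = [1, 2, 3, 4, 5, 6, 7, 8]
outlier = [0, 0, 5, 0, 0, 7, 0, 3]
noise = [3, 1, 4, 1, 5, 9, 2, 6]
payment = [100 + 50 * d + 40 * o for d, o in zip(days, outlier)]

selected = forward_select({"days": days, "outlier": outlier, "noise": noise}, payment)
print(selected)
```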
Regression Output Screen – No Interactions
Plots of Mean Predicted Payment Compared to Actual Payment
Plots of Maximum Predicted Payment Compared to Actual Payment
Regression ANOVA Table
Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                                                                <.0001
Error
Corrected Total
Model Fit Statistics reported: R-Square, Adj R-Sq, AIC, BIC, SBC, C(p)
Model Fit Statistics
Statistics Label                                 Train          Validation                Test
Akaike's Information Criterion              16,214,149
Average Squared Error                       45,520,013          46,213,608          45,574,862
Average Error Function                      45,520,013          46,213,608          45,574,862
Degrees of Freedom for Error                   919,453
Model Degrees of Freedom                            42
Total Degrees of Freedom                       919,495
Divisor for ASE                                919,495             689,621             689,622
Error Function                      41,855,424,402,171  31,869,874,606,098  31,429,427,255,852
Final Prediction Error                      45,524,172
Maximum Absolute Error                         411,742             979,415             311,907
Mean Square Error                           45,522,092          46,213,608          45,574,862
Sum of Frequencies                             919,495             689,621             689,622
Number of Estimate Weights                          42
Root Average Sum of Squares                      6,747               6,798               6,751
Root Final Prediction Error                      6,747
Root Mean Squared Error                          6,747               6,798               6,751
Schwarz's Bayesian Criterion                16,214,641
Sum of Squared Errors               41,855,424,402,171  31,869,874,606,098  31,429,427,255,852
Sum of Case Weights Times Freq                 919,495             689,621             689,622
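Several of the training-column statistics above are related by simple arithmetic, which the following Python sketch checks using the values from the slide. The ASE, MSE, and root formulas are standard definitions; the Final Prediction Error formula used here, MSE × (N + p) / N, is an assumption that happens to reproduce the reported value.

```python
import math

# Training-column values taken from the fit statistics table.
n_train = 919_495               # divisor for ASE (number of training cases)
model_df = 42                   # model degrees of freedom (estimated weights)
sse_train = 41_855_424_402_171  # sum of squared errors

error_df = n_train - model_df         # degrees of freedom for error
ase = sse_train / n_train             # average squared error
mse = sse_train / error_df            # mean square error
fpe = mse * (n_train + model_df) / n_train  # assumed FPE formula
rase = math.sqrt(ase)                 # root average sum of squares
rmse = math.sqrt(mse)                 # root mean squared error

print(error_df, round(ase), round(mse), round(fpe), round(rase), round(rmse))
```

With nearly a million training cases and only 42 weights, ASE, MSE, and FPE differ by a few thousand out of roughly 45.5 million, so all three roots round to the same value.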
Variable Significance
Type 3 Analysis of Effects
Effect                           DF    Sum of Squares    F Value    Pr > F
CLM_IP_ADMSN_TYPE_CD                                                <.0001
CLM_PASS_THRU_PER_DIEM_AMT                                          <.0001
CLM_SRC_IP_ADMSN_CD                                                 <.0001
CLM_UTLZTN_DAY_CNT                                                  <.0001
IP_COINS                                                            <.0001
IP_DED                                                              <.0001
NCH_CLM_TYPE_CD                                                     <.0001
NCH_DRG_OUTLIER_APRVD_PMT_AMT                                       <.0001
PTNT_DSCHRG_STUS_CD                                                 <.0001
TRANSFER MD                                                         <.0001
Regression Output Screen – Interactions Included
Regression Output – Interactions Included
Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                                                                <.0001
Error
Corrected Total
Model Fit Statistics reported: R-Square, Adj R-Sq, AIC, BIC, SBC, C(p)
Model Fit Statistics – Interactions Included
Statistics Label                                 Train          Validation                Test
Akaike's Information Criterion              16,203,241
Average Squared Error                       44,950,470          45,901,182          45,230,501
Average Error Function                      44,950,470          45,901,182          45,230,501
Degrees of Freedom for Error                   919,118
Model Degrees of Freedom                           377
Total Degrees of Freedom                       919,495
Divisor for ASE                                919,495             689,621             689,622
Error Function                      41,331,732,859,839  31,654,419,084,526  31,191,948,490,416
Final Prediction Error                      44,987,346
Maximum Absolute Error                         372,404             981,944             308,056
Mean Square Error                           44,968,908          45,901,182          45,230,501
Sum of Frequencies                             919,495             689,621             689,622
Number of Estimate Weights                         377
Root Average Sum of Squares                      6,705               6,775               6,725
Root Final Prediction Error                      6,707
Root Mean Squared Error                          6,706               6,775               6,725
Schwarz's Bayesian Criterion                16,207,664
Sum of Squared Errors               41,331,732,859,839  31,654,419,084,526  31,191,948,490,416
Sum of Case Weights Times Freq                 919,495             689,621             689,622
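As a rough way to compare the two fits, the Python sketch below computes the percentage reduction in Average Squared Error when interactions are included, using the values reported in the two fit statistics tables. The dictionary names are illustrative only.

```python
# Average Squared Error by data partition, from the two fit statistics tables.
ase_no_interactions = {"train": 45_520_013, "validation": 46_213_608, "test": 45_574_862}
ase_with_interactions = {"train": 44_950_470, "validation": 45_901_182, "test": 45_230_501}

# Number of estimated weights in each model.
weights_no_interactions = 42
weights_with_interactions = 377

pct_reduction = {
    part: 100.0 * (ase_no_interactions[part] - ase_with_interactions[part])
    / ase_no_interactions[part]
    for part in ase_no_interactions
}

for part, pct in pct_reduction.items():
    print(f"{part}: ASE reduced by {pct:.2f}%")
print("additional weights:", weights_with_interactions - weights_no_interactions)
```

The reduction is largest on the training partition and smaller on the validation and test partitions, which is the pattern to expect when extra terms partly fit training noise.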
Validation Dataset Results
Data Role=VALIDATE, Target Variable=ACTUAL_PMT
Columns: Percentile, Number of Observations, Mean Target, Mean Predicted
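A table of this shape can be approximated with the Python sketch below, which ranks observations by predicted value, splits them into equal-sized groups, and reports the mean target and mean prediction per group. The function name, the number of groups, and the descending sort order are assumptions; the actual Enterprise Miner binning may differ.

```python
def score_by_percentile(actual, predicted, n_bins=20):
    """Group observations by rank of predicted value and average each group."""
    pairs = sorted(zip(predicted, actual), reverse=True)  # highest predictions first
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than observations
        rows.append({
            "percentile": round(100 * (b + 1) / n_bins),
            "n_obs": len(chunk),
            "mean_target": sum(a for _, a in chunk) / len(chunk),
            "mean_predicted": sum(p for p, _ in chunk) / len(chunk),
        })
    return rows


# Tiny made-up example with two groups.
rows = score_by_percentile([10, 20, 30, 40], [12, 18, 33, 41], n_bins=2)
for row in rows:
    print(row)
```

Groups where the mean prediction drifts away from the mean target point to payment ranges the model fits poorly.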
Scored Distribution
Scoring Results
SAS code Tab
Comparison
Model-calculated standardized payments were compared to standardized payments resulting from the current method
Difference in standardized payments was calculated as
Standardized Difference = Standardized Payment - Predicted Actual Payment
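The difference defined above is a one-line calculation. In the Python sketch below, `stand_diff` and `abs_stand_diff` follow the variable names that appear in the summary statistics, while the input field names and the dollar amounts are made up for illustration.

```python
def standardized_difference(standardized_pmt, predicted_actual_pmt):
    """Standardized Difference = Standardized Payment - Predicted Actual Payment."""
    return standardized_pmt - predicted_actual_pmt


# Hypothetical claims with both payment amounts already attached.
claims = [
    {"standardized_pmt": 7500.0, "predicted_actual_pmt": 7200.0},
    {"standardized_pmt": 4100.0, "predicted_actual_pmt": 4650.0},
]

for claim in claims:
    diff = standardized_difference(claim["standardized_pmt"], claim["predicted_actual_pmt"])
    claim["stand_diff"] = diff           # signed difference
    claim["abs_stand_diff"] = abs(diff)  # magnitude, for ranking large gaps

print([(c["stand_diff"], c["abs_stand_diff"]) for c in claims])
```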
Summary Statistics
Statistics reported for each variable: Median, Minimum, Maximum, Mean, Standard Deviation, Skewness, Kurtosis, Abs C.V., Coefficient of Variation
Variables summarized: CLM_NON_UTLZTN_DAYS_CNT, IP_COINS, NCH_DRG_OUTLIER_APRVD_PMT_AMT, stand_diff, CLM_PASS_THRU_PER_DIEM_AMT, abs_stand_diff, ACTUAL_PMT, IP_DED, Predicted_Actual_PMT
Correlation with Difference in Standardized Payments
Panels: Claim Utilization Day Count, DRG Outlier Approved Payment Amount, IP Deductible, IP Coinsurance
Variability of Categorical Variables
Evaluation of Standardized Differences
Conclusions and Lessons Learned
A multilinear model without interactions and polynomial terms closely approximates a standardized payment based on payment rules for non-extreme payments
A more accurate model may require splicing of data for high and low payments
Large differences between standardized payments may be related to payments from Maryland providers and to claim type
Conclusions and Lessons Learned
Enterprise Miner allows for graphical evaluation of results at each step of a modeling process
Model parameters can be changed and applied quickly and easily
The measurement scale for variables must be accurate, or results may be inaccurate or not produced
Conclusions and Lessons Learned
Class variables should have a maximum of 512 categories
Regression results will still be produced regardless of underlying assumptions
– Assumptions should still be tested and evaluated to ensure accuracy of results
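A quick pre-check for the 512-category guideline could look like the Python sketch below. The function name and the `provider_id` field are hypothetical; only the limit of 512 comes from the slide.

```python
MAX_CLASS_LEVELS = 512  # guideline from the slide


def oversized_class_variables(rows, class_vars, limit=MAX_CLASS_LEVELS):
    """Return {variable: distinct level count} for class variables above the limit."""
    levels = {var: set() for var in class_vars}
    for row in rows:
        for var in class_vars:
            levels[var].add(row[var])
    return {var: len(seen) for var, seen in levels.items() if len(seen) > limit}


# Synthetic data: a high-cardinality identifier and a low-cardinality status code.
rows = [
    {"provider_id": f"P{i:04d}", "PTNT_DSCHRG_STUS_CD": str(i % 5)}
    for i in range(600)
]

too_many = oversized_class_variables(rows, ["provider_id", "PTNT_DSCHRG_STUS_CD"])
print(too_many)
```

Variables flagged this way are candidates for grouping levels or for being treated as something other than a class input before modeling.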
Questions