# Peiman Pazhoheshfar, Young Researchers Club, Azad University of Tafresh, Iran. Eleventh International Conference on Fuzzy Set Theory.


Peiman Pazhoheshfar, Young Researchers Club, Azad University of Tafresh, Iran (P.Pazhohesh@gmail.com). Eleventh International Conference on Fuzzy Set Theory and Applications (FSTA 2012). Penalized Trimmed Squares and Quadratic Mixed Integer Programming for Deleting Outliers in Fuzzy Linear Regression

Outline

1) Introduction
2) Fuzzy regression models
3) A mathematical programming approach
4) Quadratic mixed integer programming for penalized trimmed squares (PTS)
5) Numerical example
6) Conclusion

The use of statistical linear regression is bounded by some strict assumptions about the given data. Fuzzy regression (FR) was introduced as an extension of conventional regression and is used to estimate the relationships among variables. 1- Introduction

The goal of FR analysis is to find a regression model that fits all observed fuzzy data within a specified fitting criterion. Two approaches to FR:

1. Minimizing fuzziness as the optimality criterion. This approach is simple to program and compute, but it yields estimation ranges that are too wide to be of much help in applications.
2. Least squares of errors as the fitting criterion, which minimizes the total squared error of the output.

1- Introduction

In fuzzy linear regression models, data often contain outliers and bad influential observations. If the data are contaminated with a single or a few outliers, the problem of identifying such observations is not difficult. Detection of outliers can identify system faults and fraud before they escalate with potentially catastrophic consequences. Sources of outliers: mechanical faults, changes in system behavior, instrument error, human error. 1- Introduction

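The fuzzy linear regression model is $\tilde{Y} = \tilde{A}_0 + \tilde{A}_1 x_1 + \dots + \tilde{A}_n x_n$, where each fuzzy coefficient $\tilde{A}_j = (p_j, c_j)$ is a symmetric fuzzy number with support $[p_j - c_j,\ p_j + c_j]$: $p_j$ is its central value and $c_j$ is the spread value. 2- Fuzzy regression models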

The values of the independent variables are supposed to be non-negative, because the fuzziness in estimated intervals usually increases for larger values of the independent variables. The total vagueness of the given data should be minimized. When the sum of the coefficient spreads is used as the objective, the results are scale dependent and many spreads might be equal to zero. To repair this problem, the sum of spreads of the estimated intervals can be used as the objective function in place of the sum of spreads of the FR model coefficients.
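The scale dependence mentioned above can be illustrated with a small sketch. The data, the spreads `c`, and the function names are hypothetical; the only assumption taken from the standard fuzzy regression setup is that the spread of an estimated interval is $\sum_j c_j |x_{ij}|$.

```python
def coefficient_spread_sum(c):
    """Objective 1: sum of the coefficient spreads c_j."""
    return sum(c)

def interval_spread_sum(c, X):
    """Objective 2: sum over observations of the estimated-interval spread."""
    return sum(sum(cj * abs(xj) for cj, xj in zip(c, row)) for row in X)

# Hypothetical design matrix: intercept column plus one regressor.
X = [[1.0, 2.0], [1.0, 5.0], [1.0, 9.0]]
c = [0.5, 0.2]

# Change the unit of the regressor by a factor of 10; the matching
# coefficient spread shrinks by the same factor.
X_scaled = [[row[0], 10.0 * row[1]] for row in X]
c_scaled = [c[0], c[1] / 10.0]

print(coefficient_spread_sum(c), coefficient_spread_sum(c_scaled))
print(interval_spread_sum(c, X), interval_spread_sum(c_scaled, X_scaled))
```

The first objective changes with the unit of measurement, while the second does not, which is the motivation for replacing one with the other.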

Each H-certain estimated interval is required to contain the corresponding H-certain observed interval. This results in large coefficient spreads $c_j$ if any dependent variable has a large spread $e_j$ or if there are outliers. 2- Fuzzy regression models
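A minimal sketch of this inclusion requirement follows, assuming symmetric triangular fuzzy numbers so that the H-level interval of a number with center $m$ and spread $s$ is $[m - (1-H)s,\ m + (1-H)s]$. The coefficient centers `p`, the spreads `c`, and the function names are hypothetical illustrations, not the author's notation.

```python
def h_level_interval(center, spread, h):
    """H-level interval of a symmetric triangular fuzzy number."""
    half_width = (1.0 - h) * spread
    return center - half_width, center + half_width

def estimated_fuzzy_output(p, c, x):
    """Center and spread of the estimated output for one input row x."""
    center = sum(pj * xj for pj, xj in zip(p, x))
    spread = sum(cj * abs(xj) for cj, xj in zip(c, x))
    return center, spread

def includes_observation(p, c, x, y, e, h):
    """True if the estimated H-level interval contains the observed one."""
    est_center, est_spread = estimated_fuzzy_output(p, c, x)
    est_lo, est_hi = h_level_interval(est_center, est_spread, h)
    obs_lo, obs_hi = h_level_interval(y, e, h)
    return est_lo <= obs_lo and obs_hi <= est_hi

x = [1.0, 1.0]   # intercept term and one regressor value
y, e = 8.0, 1.8  # observed center and spread
wide = includes_observation([5.0, 2.0], [3.0, 0.5], x, y, e, h=0.0)
narrow = includes_observation([5.0, 2.0], [1.0, 0.5], x, y, e, h=0.0)
print(wide, narrow)
```

With the wider spreads the constraint holds; with the narrower ones it fails, so a single observation with a large spread (or an outlier) forces the coefficient spreads up.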

Penalized Trimmed Squares (PTS): PTS is defined by minimizing a convex objective function (loss function), which is the sum of squared residuals and penalty costs for discarding bad observations. The robust estimate is obtained as the unique optimum solution of a convex mathematical program, a quadratic mixed integer program (QMIP).

Assumptions:

- Crisp input
- Crisp output
- The relation between input and output is a fuzzy function

3- A mathematical programming approach

The basic idea is to insert fixed penalty costs into the loss function for possible deletion. Only observations whose deletion reduces the loss by more than their penalty cost are deleted from the data set. The proposed PTS estimator minimizes the sum of the k squared residuals in the clean data plus the sum of the penalties for deleting the remaining observations. Of the M observations, k are clean data and M-k are outliers. 3- A mathematical programming approach
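For a fixed set of residuals, the deletion rule just described reduces to comparing each squared residual with the penalty. The sketch below uses hypothetical residuals and a single hypothetical penalty value shared by all observations.

```python
def pts_loss(residuals, penalty):
    """PTS loss for fixed residuals: keep an observation when its squared
    residual does not exceed the penalty, otherwise pay the penalty."""
    kept = [r for r in residuals if r * r <= penalty]
    n_deleted = len(residuals) - len(kept)
    loss = sum(r * r for r in kept) + penalty * n_deleted
    return loss, len(kept), n_deleted

residuals = [0.3, -0.5, 0.2, 4.0, -0.1]  # hypothetical residuals, M = 5
penalty = 2.25                            # hypothetical penalty per deletion
loss, k, n_deleted = pts_loss(residuals, penalty)
print(loss, k, n_deleted)
```

Here the residual 4.0 would contribute 16 to the sum of squares, so deleting it at a cost of 2.25 lowers the loss; the other four observations stay in the clean set.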


The above analysis leads to a quadratic programming problem in which the value $c\sigma$ can be interpreted as a threshold for the allowable size of the residuals.
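The slide's own formulation did not survive in the transcript. As a hedged sketch only, the crisp-regression form of PTS that matches the description above can be written as follows; the binary deletion indicators $\delta_i$, the auxiliary variables $u_i$ and $s_i$, and the large constant $L$ are assumptions introduced here for illustration, and the fuzzy-regression version presented in the talk may differ.

```latex
\min_{\beta,\,u,\,s,\,\delta}\;
  \sum_{i=1}^{M} u_i^{2} \;+\; \sum_{i=1}^{M} (c\sigma)^{2}\,\delta_i
\quad\text{subject to}\quad
\begin{aligned}
  & y_i - x_i^{\top}\beta = u_i + s_i, \\
  & -L\,\delta_i \le s_i \le L\,\delta_i, \\
  & \delta_i \in \{0, 1\}, \qquad i = 1, \dots, M .
\end{aligned}
```

When $\delta_i = 0$ the slack $s_i$ is forced to zero and the full residual is squared; when $\delta_i = 1$ the slack absorbs the residual and the observation costs $(c\sigma)^2$ instead. For fixed $\beta$ this is equivalent to $\sum_i \min\{r_i^2, (c\sigma)^2\}$ with $r_i = y_i - x_i^{\top}\beta$, which is why $c\sigma$ acts as a threshold on the residuals.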

The constant c is well known from robust statistics as a cut-off parameter; here it acts as a cut-off between data outliers and prediction vagueness. A value of 2.5σ or 3σ is a reasonable threshold under Gaussian conditions. The penalty cost is defined a priori, and the estimator's performance is very sensitive to this penalty, which regulates the robustness and the efficiency of the estimator. The term $(c\sigma)^2$ can be interpreted as a penalty cost for deleting any observation, where σ is a robust residual scale and c is a cut-off parameter. 3- A mathematical programming approach
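The penalty $(c\sigma)^2$ needs a robust residual scale σ. The slides do not say which one is used; the sketch below assumes the normalized median absolute deviation (MAD), a common choice, and uses hypothetical residuals.

```python
from statistics import median

def mad_scale(residuals):
    """Robust scale: 1.4826 * median absolute deviation from the median.
    Using the MAD is an assumption; the source names no specific scale."""
    med = median(residuals)
    return 1.4826 * median([abs(r - med) for r in residuals])

def deletion_penalty(residuals, c=2.5):
    """Penalty (c * sigma)^2 for deleting one observation."""
    return (c * mad_scale(residuals)) ** 2

residuals = [0.3, -0.5, 0.2, 4.0, -0.1]  # hypothetical residuals
sigma = mad_scale(residuals)
penalty = deletion_penalty(residuals, c=2.5)
flagged = [r for r in residuals if r * r > penalty]
print(sigma, penalty, flagged)
```

Because the median ignores the size of the largest residual, the single gross residual does not inflate σ, and it is the only one that exceeds the threshold $c\sigma$.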

The aim is to construct a regression estimator that combines a high breakdown point with good efficiency. For this purpose, appropriate penalties for high-leverage observations are developed in order to unmask multiple outliers and to delete bad high-leverage outliers while keeping all good high-leverage points. 3- A mathematical programming approach

Data points and robust penalties (the formulas on this slide are not recoverable from the transcript). 4- Quadratic mixed integer programming for PTS
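To make the mixed integer structure concrete, the sketch below solves a tiny crisp PTS problem exactly by enumerating every candidate clean subset instead of calling a QMIP solver. This enumeration is a substitute technique chosen for illustration, practical only for very small M; the data, the penalty, and the restriction to subsets with at least two distinct x values are all assumptions.

```python
from itertools import combinations

def ols_line(points):
    """Least-squares line y = b0 + b1 * x; None if all x are equal."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0.0:
        return None
    b1 = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b1 * mx, b1

def pts_brute_force(points, penalty):
    """Try every kept subset of size >= 2 and return the best
    (loss, (b0, b1), deleted_indices)."""
    m = len(points)
    best = None
    for k in range(2, m + 1):
        for kept in combinations(range(m), k):
            fit = ols_line([points[i] for i in kept])
            if fit is None:
                continue
            b0, b1 = fit
            sse = sum((points[i][1] - b0 - b1 * points[i][0]) ** 2
                      for i in kept)
            loss = sse + penalty * (m - k)
            if best is None or loss < best[0]:
                deleted = [i for i in range(m) if i not in kept]
                best = (loss, fit, deleted)
    return best

# Hypothetical data lying close to y = 2x, with one gross outlier at index 4.
points = [(1, 2.1), (2, 3.9), (3, 6.0), (4, 8.1), (5, 20.0), (6, 12.0)]
loss, (b0, b1), deleted = pts_brute_force(points, penalty=1.0)
print(loss, b0, b1, deleted)
```

For a fixed choice of deleted observations the best coefficients are the ordinary least-squares fit on the rest, so the enumeration covers the same search space a QMIP solver explores through its binary variables.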

How does the proposed method perform in fuzzy regression analysis in comparison with other methods? Tanaka et al. (1989) designed an example to illustrate their regression model for the problem of a crisp independent variable and a fuzzy dependent variable; the example has fuzzy observations only for the dependent variable. Other methods from the literature: Diamond (1988), Kim and Bishu (1998), Savic and Pedrycz (1991), Kao and Chyu (2003), Nasrabadi et al. (2003). 5- Numerical example

Error in estimation for each method:

| i | $x_i$ | $(y_i, e_i)$ | Tanaka et al. | Diamond | Kim-Bishu | Kao-Chyu | Nasrabadi et al. | Proposed |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | (8.0, 1.8) | 3.350 | 2.207 | 2.07 | 2.217 | 2.564 | 2.805 |
| 2 | 2 | (6.4, 2.4) | 2.850 | 3.050 | 3.025 | 3.024 | 2.813 | 2.170 |
| 3 | 3 | (9.5, 2.6) | 1.522 | 1.092 | 1.042 | 1.082 | 0.718 | 0.551 |
| 4 | 4 | (13.5, 2.6) | 2.257 | 2.844 | 2.902 | 2.812 | 3.062 | 2.073 |
| 5 | 5 | (13, 2.4) | 2.415 | 0.950 | 0.850 | 0.954 | 0.614 | 0.65 |
| Total error | | | 12.39 | 10.143 | 10.026 | 10.089 | 9.771 | 8.249 |

In the study of Tanaka et al. (1989), three types of fuzzy regression models were discussed: the Min problem, the Max problem, and the Conjunction problem. For the sake of simplicity, the results of the Min problem at h = 0 are used for comparison. 5- Numerical example
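The reported totals can be checked by summing the per-observation errors as they appear in this transcript. The sketch below does only that arithmetic; the dictionary layout and the 0.01 tolerance are choices made here.

```python
# Per-observation errors and totals, copied from the table as transcribed.
errors = {
    "Tanaka et al.":    [3.350, 2.850, 1.522, 2.257, 2.415],
    "Diamond":          [2.207, 3.050, 1.092, 2.844, 0.950],
    "Kim-Bishu":        [2.07, 3.025, 1.042, 2.902, 0.850],
    "Kao-Chyu":         [2.217, 3.024, 1.082, 2.812, 0.954],
    "Nasrabadi et al.": [2.564, 2.813, 0.718, 3.062, 0.614],
    "Proposed":         [2.805, 2.170, 0.551, 2.073, 0.65],
}
reported_total = {
    "Tanaka et al.": 12.39, "Diamond": 10.143, "Kim-Bishu": 10.026,
    "Kao-Chyu": 10.089, "Nasrabadi et al.": 9.771, "Proposed": 8.249,
}

# Flag any column whose sum differs from the reported total by more than 0.01.
mismatched = []
for method, column in errors.items():
    total = sum(column)
    print(f"{method}: computed {total:.3f}, reported {reported_total[method]}")
    if abs(total - reported_total[method]) > 0.01:
        mismatched.append(method)
print(mismatched)
```

Five of the six columns agree with their totals. The Kim-Bishu column sums to 9.889 rather than 10.026 with the values as transcribed, which suggests a transcription slip in one of its entries; the proposed method still has the lowest total either way.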

A new methodology for deleting outliers in fuzzy linear regression is presented, which reduces the problem to a quadratic mixed integer program. The approach is shown to perform well when compared with other models in the fuzzy regression literature. 6- Conclusion