11 HATCO Case: Primary Database
This example investigates a business-to-business case from existing customers of HATCO. The primary database consists of 100 observations on 14 separate variables. Three types of information were collected:
The perceptions of HATCO, 7 attributes (X1–X7);
The actual purchase outcomes, 2 specific measures (X9, X10);
The characteristics of the purchasing companies, 5 characteristics (X8, X11–X14).
12 Table 2.1 Description of Database Variables (Hair et al., 1998)
13 Missing Data
A missing data process is any systematic event external to the respondent (e.g., data entry errors or data collection problems) or action on the part of the respondent (such as refusal to answer) that leads to missing values.
The impact of missing data is detrimental not only through its potential “hidden” biases of the results but also in its practical impact on the sample size available for analysis.
14 Understanding the missing data
Ignorable missing data;
Remediable missing data;
Examining the pattern of missing data.
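As a minimal sketch of what examining the pattern of missing data can look like in practice, the Python snippet below counts missing values per variable and tabulates which variables are missing together. The helper names (`missing_summary`, `missing_patterns`) and the toy records are hypothetical and are not taken from the HATCO pretest data; missing values are assumed to be coded as `None`.

```python
from collections import Counter


def missing_summary(rows, variables):
    """Return {variable: (number missing, percent missing)}."""
    n = len(rows)
    summary = {}
    for var in variables:
        n_missing = sum(1 for row in rows if row.get(var) is None)
        summary[var] = (n_missing, 100.0 * n_missing / n)
    return summary


def missing_patterns(rows, variables):
    """Count how often each combination of missing variables occurs.

    The key () stands for complete cases (nothing missing).
    """
    return Counter(
        tuple(var for var in variables if row.get(var) is None)
        for row in rows
    )


# Hypothetical toy records (illustration only, not the HATCO data).
rows = [
    {"X1": 4.1, "X2": 0.6, "X3": 6.9},
    {"X1": None, "X2": 3.0, "X3": 6.3},
    {"X1": 2.4, "X2": None, "X3": None},
    {"X1": None, "X2": 1.5, "X3": 8.0},
]
variables = ["X1", "X2", "X3"]

summary = missing_summary(rows, variables)
patterns = missing_patterns(rows, variables)
complete_cases = patterns[()]

for var, (count, pct) in summary.items():
    print(f"{var}: {count} missing ({pct:.1f}%)")
print("complete cases:", complete_cases)
```

Variables that tend to be missing together show up as a shared key in `patterns`, which is one simple way to check whether missingness looks random or concentrated in particular cases or variables.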
15 Table 2.2 Summary Statistics of Pretest Data (Hair et al., 1998)
16 Outliers
Four classes of outliers:
Procedural error;
Extraordinary event that can be explained;
Extraordinary observations that have no explanation;
Observations that fall within the ordinary range of values on each of the variables but are unique in their combination of values across the variables.
Detecting outliers:
Univariate detection;
Bivariate detection;
Multivariate detection.
17 Outlier detection
Univariate detection threshold:
For small samples, within ±2.5 standardized variable values;
For larger samples, within ±3 or ±4 standardized variable values.
Bivariate detection threshold:
An ellipse covering between 50 and 90 percent of the bivariate normal distribution.
Multivariate detection:
The Mahalanobis distance D²
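The univariate threshold and the Mahalanobis distance can be illustrated with a short Python sketch. It assumes the sample standard deviation (n - 1 denominator) for standardizing, and it computes D² only for the two-variable case, where the covariance matrix can be inverted by hand. The function names and the toy values are hypothetical. No cut-off for D² is applied, since the slide does not state one.

```python
import statistics


def univariate_outliers(values, threshold=2.5):
    """Indices of values whose standardized score exceeds the threshold.

    threshold=2.5 follows the small-sample guideline; use 3 or 4 for
    larger samples.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [i for i, v in enumerate(values) if abs((v - mean) / sd) > threshold]


def mahalanobis_d2(points):
    """Squared Mahalanobis distance D² of each (x, y) point from the centroid.

    Two-variable case only: the 2x2 sample covariance matrix is inverted
    in closed form.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    d2 = []
    for x, y in points:
        dx, dy = x - mx, y - my
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2


# Hypothetical toy values (illustration only).
values = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3, 5.1, 4.9, 5.0, 9.5]
flagged = univariate_outliers(values)

# The last point lies within the range of each variable taken separately,
# but its combination of values departs from the overall pattern.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1),
          (6, 6.0), (7, 6.9), (8, 8.2), (2, 7.5)]
d2 = mahalanobis_d2(points)
most_extreme = d2.index(max(d2))

print("univariate outliers at indices:", flagged)
print("largest D² at index:", most_extreme)
```

The second example corresponds to the fourth class of outlier: a case that no univariate check would flag, because each value is ordinary on its own, yet it stands out once the variables are considered jointly.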
18 Table 2.7 Identification of Univariate and Bivariate Outliers (Hair et al., 1998)
19 Fig 2.3 Graphical Identification of Bivariate Outliers (Hair et al