Business Intelligence/ Decision Models Week 3 Data Preparation and Transformation.


1 Business Intelligence/ Decision Models Week 3 Data Preparation and Transformation

2 Last Week
- OLTP, data warehouse repository and data mart structures (flat and relational files)
- Data integrity and normalization
- DB interrogation (SQL) for:
  - OLAP and reporting
  - Migration into data mining suites


5 [Figure; labels: Time/Cost, Cumulated Productivity]


7 Learning by association or problem solving

8 This Week
- CRISP-DM (Cross-Industry Standard Process for Data Mining)
- Data preparation (import, aggregate and merge)
- Data transformation (for analytics)

9 CRISP-DM Phases (Source: SPSS Inc. 2008)


18 Case Study
A large telecom (XYZ PHONE) has discovered that it is losing customers at a much higher rate than in previous years. Reporting through the corporate dashboard (OLAP) has shown churn rates growing by a large margin last year.

19 Define Business Objectives (Source: SPSS Inc. 2008)
- Strategic objective definition
  - Increase revenues by retaining more customers
- Related business goal identification
  - Retain high value customers
  - Identify process problems that need to be changed
- Clear success factor (metric)
  - Decrease customer churn by 1%
- Cost-benefit analysis
  - Increase revenues by $750,000
- Actionable BI objectives
  - XYZ wants to retain more customers by identifying likely churners 2 months prior and putting an action in place to retain them

20 Timeline Example (Source: SPSS Inc. 2008)
XYZ's project: 13 weeks
- 8 weeks: (a) business understanding and (b) data preparation
  - Involved line of business manager and data expert
  - Included better defining the high-value and churner definitions
- 2 weeks: data understanding
  - Heavy reliance on data expert and database administrator
- 2 weeks: modeling and evaluation
  - Models developed by data miner and results evaluated by line of business manager
- 1 week: deployment
  - Heavy involvement of database administrator
  - Model deployment entailed setting up a data model for monthly scoring of the customer base, with the resulting reports feeding a mail offer
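As a quick check on the timeline above, the phase durations can be turned into shares of the 13-week project. This is only a sketch of the arithmetic; the dictionary keys are labels chosen here for illustration, and note that XYZ's plan groups business understanding together with data preparation.

```python
# Phase durations in weeks, as listed for XYZ's project.
phase_weeks = {
    "business understanding + data preparation": 8,
    "data understanding": 2,
    "modeling and evaluation": 2,
    "deployment": 1,
}

total_weeks = sum(phase_weeks.values())

# Share of the whole project spent in each phase, rounded to whole percent.
phase_share = {
    phase: round(100 * weeks / total_weeks)
    for phase, weeks in phase_weeks.items()
}

for phase, pct in phase_share.items():
    print(f"{phase}: {phase_weeks[phase]} of {total_weeks} weeks ({pct}%)")
```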

21 Time Allocation (Source: SPSS Inc. 2008)
Generally accepted industry timeline standards:
- 50 to 70 percent: data preparation
- 20 to 30 percent: data understanding
- 10 to 20 percent: modeling, evaluation, and business understanding
- 5 to 10 percent: deployment

22 Data Import and Transformation

23 Lab Objectives
- Extract data from:
  - Customer file
  - Transactional file
- Transform data into information
  - Data preparation
    - Aggregate data from transactional file
    - Merge aggregate data and customer file

24 Data Import Step by Step
- Import files from Access or Excel
  - Customer and Transaction files
- Document variable labels and value labels using the data dictionary
- Aggregate the transaction file by cust_id with summary data and key variables
- Merge the Customer file and the aggregated transaction file using cust_id as a common key

25 Aggregating Transaction File

Transaction file:

Order_id  Date   Cust_id  Prod_num  Amt
4433      10/21  1011     231       120
4434      10/30  2234     143       240
4435      11/05  2876     432       175
4436      11/05  3454     143       240
4437      11/07  2234     223       600
4438      11/08  1011     254       211
4439      11/08  2876     534       300
4440      11/08  1011     143       240
4441      11/12  3454     322       150
4442      11/13  2876     512       321
4443      11/13  1011     412       125

Aggregated by Cust_id:

Cust_id  Freq  Date1  Date2  Amt_sum
1011     4     10/21  11/13  696
2234     2     10/30  11/07  840
2876     3     11/05  11/13  796
3454     2     11/05  11/12  380
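The aggregation on this slide can be sketched in plain Python: group the transaction rows by Cust_id, count them (Freq), keep the earliest and latest dates (Date1, Date2) and sum the amounts (Amt_sum). The year 2008 is an assumption, since the slide shows month/day only, and the function names are hypothetical. With the amounts as listed, customers 1011, 2234 and 2876 reproduce the slide's summary; customer 3454 sums to 390 (240 + 150), whereas the slide's summary shows 380.

```python
from datetime import date

YEAR = 2008  # assumption: the slide gives month/day only

# (order_id, "MM/DD", cust_id, prod_num, amt), copied from the transaction file.
transactions = [
    (4433, "10/21", 1011, 231, 120),
    (4434, "10/30", 2234, 143, 240),
    (4435, "11/05", 2876, 432, 175),
    (4436, "11/05", 3454, 143, 240),
    (4437, "11/07", 2234, 223, 600),
    (4438, "11/08", 1011, 254, 211),
    (4439, "11/08", 2876, 534, 300),
    (4440, "11/08", 1011, 143, 240),
    (4441, "11/12", 3454, 322, 150),
    (4442, "11/13", 2876, 512, 321),
    (4443, "11/13", 1011, 412, 125),
]


def parse_month_day(text, year=YEAR):
    """Turn 'MM/DD' into a date in the assumed year."""
    month, day = text.split("/")
    return date(year, int(month), int(day))


def aggregate_by_customer(rows):
    """Return {cust_id: {freq, date1, date2, amt_sum}} for the given rows."""
    summary = {}
    for _order_id, month_day, cust_id, _prod_num, amt in rows:
        d = parse_month_day(month_day)
        rec = summary.setdefault(
            cust_id, {"freq": 0, "date1": d, "date2": d, "amt_sum": 0}
        )
        rec["freq"] += 1
        rec["date1"] = min(rec["date1"], d)  # first purchase
        rec["date2"] = max(rec["date2"], d)  # last purchase
        rec["amt_sum"] += amt
    return summary


summary = aggregate_by_customer(transactions)
for cust_id in sorted(summary):
    rec = summary[cust_id]
    print(cust_id, rec["freq"], rec["date1"], rec["date2"], rec["amt_sum"])
```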

26 Lab Objectives (Cont.)
- Data transformation
  - Compute customers' length on file
  - Compute recency of last purchase
  - Compute frequency of purchases
  - Compute amount spent
  - Compute customer status
- Purpose
  - CLV (Week 4)
  - RFM (Week 5)

27 Data Transformation Step by Step
- Revisit measurement levels of variables (nominal, ordinal, scale)
- Define date formats
- Auto recode nominal string variables
- Define missing values
- Calculate length on file, or tenure (Date of last purchase - Date of first purchase)
- Calculate time since last purchase (Date of current file - Date of last purchase)
- Define customer status (active or lapsed)
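The two date calculations and the status flag from the steps above can be sketched as follows. Several things here are assumptions rather than part of the slides: the year 2008, a current-file date of 11/30 (chosen because it gives a recency of 17 days for a last purchase on 11/13, which agrees with the figures in the later Data Transformation table), and above all the 20-day cut-off for "lapsed", which is purely hypothetical since the slides do not define it.

```python
from datetime import date

CURRENT_FILE_DATE = date(2008, 11, 30)  # assumption: date of the current file
LAPSED_AFTER_DAYS = 20                  # hypothetical cut-off, not from the slides


def transform(first_purchase, last_purchase,
              file_date=CURRENT_FILE_DATE, lapsed_after=LAPSED_AFTER_DAYS):
    """Compute tenure, recency and status for one customer."""
    tenure_days = (last_purchase - first_purchase).days   # length on file
    recency_days = (file_date - last_purchase).days       # time since last purchase
    status = "lapsed" if recency_days > lapsed_after else "active"
    return {"tenure_days": tenure_days,
            "recency_days": recency_days,
            "status": status}


# Customer 1011: first purchase 10/21, last purchase 11/13.
cust_1011 = transform(date(2008, 10, 21), date(2008, 11, 13))
# Customer 2234: first purchase 10/30, last purchase 11/07.
cust_2234 = transform(date(2008, 10, 30), date(2008, 11, 7))

print(cust_1011)
print(cust_2234)
```

With a different cut-off the status labels change, so the threshold should come from the business definition of a lapsed customer rather than from this sketch.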

28 Merging Customer and Transaction Summary Files

Cust_id  Name   Address  Type  CC    Freq  Date1  Date2  Amt_sum
1011     Jean   NY       1     Visa  4     10/21  11/13  696
2234     John   OH       1     MC    2     10/30  11/07  840
2876     Janet  CA       2     Visa  3     11/05  11/13  796
3454     Jane   NY       3     Amex  2     11/05  11/12  380
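A minimal sketch of the merge on this slide, using cust_id as the common key. Both inputs are keyed dictionaries holding the values shown on the slide; the function name is hypothetical, and keeping only customers that appear in both files is a choice made here, since the slides do not say how unmatched records should be handled.

```python
# Customer file, keyed by cust_id.
customers = {
    1011: {"name": "Jean", "address": "NY", "type": 1, "cc": "Visa"},
    2234: {"name": "John", "address": "OH", "type": 1, "cc": "MC"},
    2876: {"name": "Janet", "address": "CA", "type": 2, "cc": "Visa"},
    3454: {"name": "Jane", "address": "NY", "type": 3, "cc": "Amex"},
}

# Aggregated transaction summary, keyed by cust_id (values as shown on the slide).
transaction_summary = {
    1011: {"freq": 4, "date1": "10/21", "date2": "11/13", "amt_sum": 696},
    2234: {"freq": 2, "date1": "10/30", "date2": "11/07", "amt_sum": 840},
    2876: {"freq": 3, "date1": "11/05", "date2": "11/13", "amt_sum": 796},
    3454: {"freq": 2, "date1": "11/05", "date2": "11/12", "amt_sum": 380},
}


def merge_on_key(left, right):
    """One-to-one merge: keep keys present in both files, combine their fields."""
    merged = {}
    for key in sorted(left.keys() & right.keys()):
        merged[key] = {**left[key], **right[key]}
    return merged


merged = merge_on_key(customers, transaction_summary)
for cust_id, row in merged.items():
    print(cust_id, row)
```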

29 Data Transformation

Cust_id  Name   Address  Type   CC      Freq  Dte1   Dte2   Amt  Days  Recency
1011     Jean   1/NY     1/Res  1/Visa  4     10/21  11/13  696  23    17
2234     John   2/OH     1/Res  2/MC    2     10/30  11/07  840  8     23
2876     Janet  3/CA     2/Bus  1/Visa  3     11/05  11/13  796  8     17
3454     Jane   1/NY     3/DNK  3/Amx   2     11/05  11/12  380  7     18
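The code/label pairs in the Address and CC columns (1/NY, 2/OH, 3/CA and 1/Visa, 2/MC, 3/Amx) can be reproduced by numbering each distinct string in order of first appearance. The sketch below does that; the function name is hypothetical, and an automatic recode in a statistics package may assign codes in a different order (for example, sorted alphabetically), so treat the ordering as an assumption.

```python
def recode_by_first_appearance(values):
    """Map each distinct string to 1, 2, 3, ... in order of first appearance.

    Returns (codebook, coded_values).
    """
    codebook = {}
    for value in values:
        if value not in codebook:
            codebook[value] = len(codebook) + 1
    return codebook, [codebook[value] for value in values]


# String columns in customer order 1011, 2234, 2876, 3454.
address = ["NY", "OH", "CA", "NY"]
credit_card = ["Visa", "MC", "Visa", "Amex"]

address_codebook, address_coded = recode_by_first_appearance(address)
cc_codebook, cc_coded = recode_by_first_appearance(credit_card)

print(address_codebook, address_coded)
print(cc_codebook, cc_coded)
```

Keeping the codebook alongside the coded column preserves the value labels, which matters when the coded data are later documented against the data dictionary.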

30 Purpose of this exercise?
Prepare data for the next two weeks:
- Customer Lifetime Value (CLV)
- RFM Analysis

