Business Intelligence - 2 BUS 782. Topics Data warehousing Data Mining.


1 Business Intelligence - 2 BUS 782

2 Topics
– Data warehousing
– Data mining

3 Data Warehouse
A data warehouse is a central repository of integrated data from one or more sources, created for reporting and data analysis. The data is:
– sourced from various operational systems in use in the organization,
– structured specifically to address reporting and analytic requirements.

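4 Example: Transaction Database (ER diagram)
– Customer: CID, Cname, City, Rating
– Order: OID, ODate, SalesPerson
– Product: PID, Pname, Price
– Customer Has Order (1:M); Order to Product is M:M, with Qty recorded on that relationship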

5 Analyze Sales Data: Detailed Business Data
Total sales:
– By product: Sum(Qty*Price), where Qty*Price is the amount of each detail line
– Detailed business data: Qty*Price
Total quantity sold:
– By product: Sum(Qty)
– Detailed business data: Qty
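The two measures on this slide can be sketched in a few lines of Python. The detail lines below are made-up sample values, and the names `detail_lines` and `sales_by_product` are assumptions for illustration, not part of the course material.

```python
from collections import defaultdict

# Hypothetical order detail lines: (PID, Qty, Price). Sample values only.
detail_lines = [
    ("P1", 2, 10.0),
    ("P2", 1, 25.0),
    ("P1", 3, 10.0),
]

def sales_by_product(lines):
    """Return {PID: (total quantity, total sales)} from detail lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for pid, qty, price in lines:
        totals[pid][0] += qty          # Sum(Qty)
        totals[pid][1] += qty * price  # Sum(Qty*Price)
    return {pid: (q, s) for pid, (q, s) in totals.items()}

result = sales_by_product(detail_lines)
```

Each detail line contributes its own Qty*Price, and the per-product totals are sums over those lines.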

6 Dimensions for Data Analysis: Factors relevant to the business data
– Analyze sales by Product
– Analyze sales related to Customer:
  – Location: Sales by City
  – Customer type: Sales by Rating
– Analyze sales related to Time:
  – Quarterly, monthly, yearly sales
– Analyze sales related to Employee:
  – Sales by SalesPerson

7 Data Warehouse Design: Star Schema
– Dimension tables contain descriptions of the subjects of the business, such as customers, employees, locations, products, and time periods.
– The fact table contains detailed business data with links to the dimension tables.

8 Star Schema
– FactTable: LocationCode, PeriodCode, Rating, PID, Qty, Amount
– Location Dimension: LocationCode, State, City
– CustomerRating Dimension: Rating, Description
– Product Dimension: PID, Pname, Category
– Period Dimension: PeriodCode, Year, Quarter
Can group by State, City
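As a rough illustration of how the fact table links to a dimension table, the sketch below builds two of the tables in an in-memory SQLite database and groups sales by State and City. The column types and the sample rows are assumptions; the slide gives only the table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location (
    LocationCode TEXT PRIMARY KEY,
    State TEXT,
    City TEXT
);
CREATE TABLE FactTable (
    LocationCode TEXT REFERENCES Location(LocationCode),
    PeriodCode TEXT,
    Rating TEXT,
    PID TEXT,
    Qty INTEGER,
    Amount REAL
);
-- Made-up sample rows for illustration only.
INSERT INTO Location VALUES
    ('L1', 'California', 'San Francisco'),
    ('L2', 'California', 'Los Angeles');
INSERT INTO FactTable VALUES
    ('L1', '20034', 'A', 'P1', 2, 20.0),
    ('L2', '20031', 'B', 'P2', 1, 25.0),
    ('L1', '20031', 'A', 'P1', 3, 30.0);
""")

# Join the fact table to the Location dimension, then group by State, City.
sales_by_location = conn.execute("""
    SELECT l.State, l.City, SUM(f.Qty), SUM(f.Amount)
    FROM FactTable AS f
    JOIN Location AS l ON f.LocationCode = l.LocationCode
    GROUP BY l.State, l.City
    ORDER BY l.City
""").fetchall()
```

The same pattern applies to the other dimensions: join on the code column, then group by whichever descriptive columns the analysis needs.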

9 Define Location Dimension
Location:
– In the transaction database: City
– In the data warehouse, we define Location to be State, City:
  San Francisco -> California, San Francisco
  Los Angeles -> California, Los Angeles
– Define Location Code:
  California, San Francisco -> L1
  California, Los Angeles -> L2
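One possible way to derive location codes is to number each distinct (State, City) pair in the order it is first seen. The lookup table `CITY_TO_STATE` and the function name are hypothetical; the slides do not say how codes are assigned in practice.

```python
# Hypothetical lookup from the transaction database's City to its State.
CITY_TO_STATE = {
    "San Francisco": "California",
    "Los Angeles": "California",
}

def build_location_dimension(cities):
    """Assign L1, L2, ... to each distinct (State, City) in first-seen order."""
    dimension = {}
    for city in cities:
        key = (CITY_TO_STATE[city], city)
        if key not in dimension:
            dimension[key] = f"L{len(dimension) + 1}"
    return dimension

location_dim = build_location_dimension(
    ["San Francisco", "Los Angeles", "San Francisco"]
)
```

Repeated cities map to the same code, so the dimension table holds one row per location.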

10 Define Period Dimension
Period:
– In the transaction database: ODate
– In the data warehouse, we define Period to be Year, Quarter:
  ODate: 11/2/2003 -> 2003, 4
  ODate: 2/28/2003 -> 2003, 1
– Define Period Code:
  2003, 4 -> 20034
  2003, 1 -> 20031
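The date-to-period mapping can be written as a small function. This is a minimal sketch assuming calendar quarters (January to March is quarter 1, and so on); the function name `period_of` is an assumption.

```python
from datetime import date

def period_of(odate):
    """Map an order date to (year, quarter, period code)."""
    quarter = (odate.month - 1) // 3 + 1   # months 1-3 -> 1, ..., 10-12 -> 4
    code = f"{odate.year}{quarter}"        # e.g. 2003, 4 -> "20034"
    return odate.year, quarter, code

november_order = period_of(date(2003, 11, 2))
february_order = period_of(date(2003, 2, 28))
```

For the two dates on the slide, this gives `(2003, 4, "20034")` and `(2003, 1, "20031")`.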

11 The ETL Process (E, T, L)
– One company-wide warehouse
– Periodic extraction -> data in the warehouse is not completely current

12 The ETL Process
ETL = Extract, transform, and load
– Capture/Extract
– Transform: scrub (data cleansing), derive
  Example: City -> LocationCode, State, City
  OrderDate -> PeriodCode, Year, Quarter
– Load and Index
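The three ETL steps can be sketched end to end on a handful of rows. Everything below is illustrative: the extracted rows are made up, `LOCATION_CODES` is a hypothetical lookup, and the "index" is just a dictionary from PeriodCode to row positions rather than a real database index.

```python
from datetime import date

# Extract: hypothetical rows pulled from the transaction database,
# as (City, Rating, OrderDate, PID, Qty, Price).
extracted = [
    ("San Francisco", "A", date(2003, 11, 2), "P1", 2, 10.0),
    ("Los Angeles", "B", date(2003, 2, 28), "P2", 1, 25.0),
    (" san francisco ", "A", date(2003, 12, 5), "P1", 3, 10.0),
]

# Hypothetical City -> LocationCode lookup.
LOCATION_CODES = {"San Francisco": "L1", "Los Angeles": "L2"}

def transform(row):
    """Scrub the city name and derive the codes and the amount."""
    city, rating, odate, pid, qty, price = row
    city = city.strip().title()              # scrub: fix spacing and casing
    quarter = (odate.month - 1) // 3 + 1     # derive: calendar quarter
    return {
        "LocationCode": LOCATION_CODES[city],
        "PeriodCode": f"{odate.year}{quarter}",
        "Rating": rating,
        "PID": pid,
        "Qty": qty,
        "Amount": qty * price,
    }

def load(rows):
    """Build the fact table and a simple index on PeriodCode."""
    fact_table = [transform(row) for row in rows]
    index = {}
    for position, fact in enumerate(fact_table):
        index.setdefault(fact["PeriodCode"], []).append(position)
    return fact_table, index

fact_table, period_index = load(extracted)
```

The third row shows the scrub step at work: the badly formatted city name still resolves to `L1`.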

13 Performing Analysis
Analyze sales:
– By Location
– By Location and Customer Type
– By Location and Period
– By Period and Product
Pivot Table:
– Drill down, roll up, reaggregation
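Roll up, drill down, and reaggregation all amount to summing the same facts over a different set of dimensions. The sketch below assumes the fact rows have already been joined to their dimension attributes; the sample rows and the helper name `aggregate` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows, already joined to Location and Period attributes.
facts = [
    {"State": "California", "City": "San Francisco", "Year": 2003, "Quarter": 4, "Amount": 20.0},
    {"State": "California", "City": "Los Angeles", "Year": 2003, "Quarter": 1, "Amount": 25.0},
    {"State": "California", "City": "San Francisco", "Year": 2003, "Quarter": 1, "Amount": 30.0},
]

def aggregate(rows, dims):
    """Sum Amount for each combination of the given dimension columns."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row["Amount"]
    return dict(totals)

by_state = aggregate(facts, ["State"])                    # roll up to State
by_city = aggregate(facts, ["State", "City"])             # drill down to City
by_city_quarter = aggregate(facts, ["City", "Quarter"])   # reaggregate by Location and Period
```

A pivot table does the same thing interactively: changing the row and column fields changes the `dims` list.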

14 Data Mining
Knowledge discovery using a blend of statistical, artificial intelligence, and computer graphics techniques.
Goals:
– Explain observed events or conditions
– Explore data for new or unexpected relationships

15 Typical Data Mining Techniques
– Statistical regression
– Decision tree induction
– Clustering: discover subgroups
– Affinity: discover things with strong mutual relationships
– Sequence association: discover cycles of events and behaviors
– Rule discovery: search for patterns and correlations
– Text mining (analytics)

16 Typical Data Mining Applications
– Profiling populations: high-value customers, credit risks, credit card fraud
– Analysis of business trends
– Target marketing
– Campaign effectiveness
– Product affinity: identifying products that are purchased concurrently
– Up-selling: identifying new products and services to sell to a customer based on critical events

17 Affinity Analysis: Market Basket Analysis
Market basket analysis is a modeling technique based on the theory that if you buy a certain group of items, you are more (or less) likely to buy another group of items. The set of items a customer buys is referred to as an itemset, and market basket analysis seeks to find relationships between purchases. Typically the relationship takes the form of a rule.
Example:
– IF {beer, no bar meal} THEN {chips}.
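One common way to judge a rule like this is by its support (the share of all baskets that match the whole rule) and its confidence (the share of baskets matching the IF part that also match the THEN part). These two measures are standard in market basket analysis but are not named in the slides, so treat them as an added assumption; the baskets and the function name `rule_stats` are made up.

```python
# Hypothetical baskets, each a set of purchased items.
baskets = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"beer", "bar meal"},
    {"chips"},
    {"beer", "chips"},
    {"beer"},
]

def rule_stats(baskets, required, excluded, consequent):
    """Support and confidence for IF required (and none of excluded) THEN consequent."""
    matches_if = [
        b for b in baskets
        if required <= b and not (excluded & b)
    ]
    matches_rule = [b for b in matches_if if consequent <= b]
    support = len(matches_rule) / len(baskets)
    confidence = len(matches_rule) / len(matches_if) if matches_if else 0.0
    return support, confidence

# IF {beer, no bar meal} THEN {chips}
support, confidence = rule_stats(
    baskets, required={"beer"}, excluded={"bar meal"}, consequent={"chips"}
)
```

With this sample data, four baskets contain beer without a bar meal and three of those also contain chips, so the rule has support 0.5 and confidence 0.75.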

18 Basket Analysis and Cross-Selling
For instance, customers are very likely to purchase shampoo and conditioner together, so a retailer would not put both items on promotion at the same time: the promotion of one would likely drive sales of the other.
A widely used example of cross-selling on the internet with market basket analysis is Amazon.com's use of suggestions of the type:
– "Customers who bought book A also bought book B", e.g.

19 Text Mining
Objective: deriving high-quality information from text.
– Text categorization
– Text clustering
– Concept/entity extraction
– Sentiment analysis, etc.

20 Social Media Mining
– Salesforce Radian6 Social Marketing Cloud
– http://www.youtube.com/watch?v=EH1dcFh_-I4

