CISB594 – Business Intelligence Data Mining. CISB594 – Business Intelligence Reference Materials used in this presentation are extracted mainly from the.


1 CISB594 – Business Intelligence Data Mining

2 Reference materials used in this presentation are extracted mainly from the following texts, unless stated otherwise.

3 Objectives
At the end of this lecture, you should be able to:
– Describe data mining, its characteristics and objectives in business
– Identify and explain the common algorithms used in data mining
– Discuss the use of data mining in different types of business
– Discuss the importance of data mining in understanding customers’ behaviours
– Discuss text and web mining

4 What is Data Mining
A process that uses statistical, mathematical, artificial intelligence and machine learning techniques to extract and identify useful information and subsequent knowledge from large databases.
– Uses sophisticated data manipulation technology
– Identifies useful information
– Deals with large databases

5 Data Mining Concepts and Applications
Where is Data Mining in Business Intelligence?

6 Why do we need Data Mining
– Users today want to perform statistical and mathematical analysis such as hypothesis testing, prediction and customer scoring models
– A major step in managerial decision making is forecasting or estimating the results of alternative courses of action
– Such investigation cannot be done with basic OLAP; it requires special tools for advanced business analytics, that is, data mining

7 Major Characteristics of Data Mining
– Data are often buried deep within very large databases, which sometimes contain data from several years
– The data mining environment is usually a client/server or Web-based architecture
– Sophisticated tools are used to clean and synchronize data in order to get the best results
– Miners are end users empowered with sophisticated tools to ask ad-hoc questions; they need not be technically skilled
– Miners may find unexpected results during data mining, and acting on them requires creative thinking in the users’ decision making

8 Data Mining algorithms
Fall into four broad categories:
1. Classification
– Also known as supervised induction
– Most common of all data mining activities
– Used to analyse the historical data stored in the database and to automatically generate a model that can predict future behaviour
– Application examples: target marketing, quality assurance
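To make "generate a model from historical data, then predict" concrete, here is a minimal sketch using a nearest-centroid classifier. The choice of algorithm, the function names `train_centroids` and `predict`, and the customer records are all assumptions made for illustration; the slides do not name a specific classification technique.

```python
from collections import defaultdict
from math import dist


def train_centroids(rows, labels):
    """Learn one centroid (mean feature vector) per class label."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: tuple(sum(col) / len(members) for col in zip(*members))
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Assign the label whose centroid is closest to the new record."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


# Hypothetical history: (age, purchases last year) and whether the
# customer responded to a past campaign.
history = [(25, 1), (30, 2), (28, 1), (52, 9), (47, 8), (55, 10)]
responded = ["no", "no", "no", "yes", "yes", "yes"]

model = train_centroids(history, responded)
print(predict(model, (50, 7)))   # a new customer resembling past responders
print(predict(model, (27, 2)))   # a new customer resembling past non-responders
```

The labels in `responded` are what makes this supervised: the model is built from records whose outcome is already known.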

9 Data Mining algorithms
Fall into four broad categories:
2. Clustering
– Partitioning a database into segments in which the members of a segment share similar qualities
– Before the results of clustering techniques are used, it might be necessary for an expert to interpret and modify the information
– Clustering techniques include optimization: the goal is to create groups so that members within each group have maximum similarity and members across groups have minimum similarity
– Application example: market segmentation
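The optimization goal described above can be sketched with k-means, one widely used clustering method. The slides do not name it, so treat the algorithm choice, the function name `k_means`, the starting centroids and the customer data as illustrative assumptions.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternately assign points to the nearest centroid
    and move each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (annual spend in thousands, visits per month)
customers = [(1, 2), (2, 1), (1.5, 1.5), (8, 9), (9, 8), (8.5, 9.5)]

# Starting centroids are picked by hand here to keep the run deterministic.
centroids, clusters = k_means(customers, [customers[0], customers[3]])
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

No labels are supplied, so the segments that come out still need the expert interpretation mentioned in the slide before they mean anything to the business.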

10 Data Mining algorithms
Fall into four broad categories:
3. Association
– Establishes relationships between items that occur together in a given record
– Determines associations among items that sell together
– Often called market basket analysis, as the primary application is the analysis of sales transactions
– Application example: market basket analysis
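As a rough sketch of market basket analysis, the code below counts how often pairs of items appear in the same transaction and reports two standard measures, support and confidence. These measures, the function name `pair_rules`, the threshold and the baskets are assumptions for illustration and are not taken from the slides.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for item pairs
    that occur together in at least min_support of all transactions."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                      # share of baskets with both items
        if support >= min_support:
            rules.append((a, b, support, count / item_counts[a]))  # a -> b
            rules.append((b, a, support, count / item_counts[b]))  # b -> a
    return rules


# Hypothetical sales transactions
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "tea"],
]

rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Confidence is directional: in this toy data every basket with butter also contains bread, but not every basket with bread contains butter.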

11 Data Mining algorithms
Fall into four broad categories:
4. Sequence discovery
– The identification of associations over time
– Some sequence discovery techniques keep track of the elapsed time between associated events and the frequency of occurrences
– Application examples: market basket analysis over time, customer life cycle analysis
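A small sketch of the idea of tracking both frequency and elapsed time: for each customer, find "A then B" purchase patterns and record the gap in days. The function name `sequence_pairs`, the data layout and the purchase histories are hypothetical, and real sequence discovery techniques are considerably more elaborate.

```python
from collections import defaultdict


def sequence_pairs(histories):
    """For every ordered pattern 'A then B', return
    (number of customers showing it, average days between A and B)."""
    gaps = defaultdict(list)
    for events in histories.values():
        ordered = sorted(events)                 # sort each history by day
        seen = set()
        for i, (day_a, item_a) in enumerate(ordered):
            for day_b, item_b in ordered[i + 1:]:
                key = (item_a, item_b)
                if item_a != item_b and day_b > day_a and key not in seen:
                    seen.add(key)                # count a pattern once per customer
                    gaps[key].append(day_b - day_a)
    return {k: (len(v), sum(v) / len(v)) for k, v in gaps.items()}


# Hypothetical purchase histories: customer -> [(day number, item), ...]
histories = {
    "c1": [(0, "printer"), (30, "ink")],
    "c2": [(5, "printer"), (45, "ink"), (50, "paper")],
    "c3": [(2, "ink"), (10, "paper")],
}

patterns = sequence_pairs(histories)
for (first, then), (customers, avg_days) in patterns.items():
    print(f"{first} then {then}: {customers} customer(s), "
          f"{avg_days:.1f} days apart on average")
```

Unlike plain association, the order matters here: "printer then ink" is counted, while "ink then printer" never occurs in this data.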

12 Types of data mining
Two types:
– Hypothesis-driven data mining: begins with a proposition by the user, who then seeks to validate the truthfulness of the proposition
– Discovery-driven data mining: finds patterns, associations, and relationships among the data in order to uncover facts that were previously unknown or not even contemplated by an organization

13 Use in business
Where data mining is beneficial (the intent in most of these examples is to identify a business opportunity and create a sustainable competitive advantage)
Business: Use
– Marketing: predicting which customers will respond to Internet banners or buy a particular product, and segmenting customer demographics
– Banking: forecasting levels of bad loans, fraud in credit card usage, credit card spending patterns, new loans
– Retailing and sales: predicting sales, determining correct inventory levels and distribution schedules
– Manufacturing and production: predicting when to expect machinery failures and for what reasons, optimizing manufacturing capacity

14 Use in business (continued)
Business: Use
– Government and defense: forecasting threats to national security, predicting resource consumption
– Health: mapping demographic data to critical illnesses, identifying patterns and the best approach for treatments
– Airlines: predicting sales, determining popular routes, capturing lost business, adding routes and destinations
– Broadcasting: predicting which programs are best shown during prime time and where to slot in advertisements

15 Data Mining in retail
The process of data mining in retail has three different aspects:
1. Web analytics – Gathers web statistics that track customers’ online behaviour: hits, pages, sales, volume, and so on. This helps in adjusting a web site to meet customer needs.
2. Customer analytics – Combines web site interactions, transaction data from offline purchases, and demographic data. This is critical in CRM and revenue management because a better understanding allows an organization to cluster customers into groupings.
3. Optimization – Patterns can be detected and used to optimize customer interactions, for example in recommending relevant styles and complementary purchases/products to suit customer behaviour.

16 Text Mining
Application of data mining to nonstructured or less structured text files. It entails the generation of meaningful numerical indices from the unstructured text and then processing these indices using various data mining algorithms.
Data Mining: Takes advantage of the infrastructure of stored data to extract additional useful information. E.g. applying data mining to a customer database, we may discover that everyone who buys product A will also buy products B and C six months later.
Text Mining: Operates with text documents, which hold less structured information. E.g. visualising relationships between documents such as policies, memos, emails, minutes of meetings etc. Organizations recognize this as one of the major sources of competitive advantage.

17 Text Mining
Example
The airline industry uses text mining software on incident reports to focus on key problem areas through pattern identification, in order to increase the quality of service.
– The most frequently occurring terms in the incident reports are identified
– The terms are clustered/grouped, e.g. the term spillage is associated with other key terms such as coffee, tea, soup, drink
– This can identify incidents that might lead to trouble and help management curb the issue

18 Text Mining
How to mine text
1. Eliminate commonly used words (e.g. the, and, other). These are known as stop-words.
2. Replace words with their stems or roots (e.g. eliminate plurals and various conjugations). The terms phoned, phoning, and phones would be mapped to phone.
3. Consider synonyms and phrases. Synonyms need to be combined, e.g. students and pupils need to be grouped together.
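The three steps above can be sketched in a few lines. This is a deliberately naive illustration: the stop-word list is the slide's own three examples, the suffix-stripping rule and the synonym table are hypothetical stand-ins, and the stemmer produces truncated keys such as "phon" rather than the dictionary word "phone". What matters for the later steps is that all variants map to the same key.

```python
import re

STOP_WORDS = {"the", "and", "other"}        # the slide's examples only
SUFFIXES = ("ing", "ed", "es", "s")         # naive stemming rule (assumption)
SYNONYMS = {"pupil": "student"}             # hypothetical synonym table


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Step 1: drop stop-words. Step 2: stem. Step 3: merge synonyms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [stem(t) for t in tokens]
    return [SYNONYMS.get(t, t) for t in tokens]


print(preprocess("phoned, phoning, and phones"))
print(preprocess("The students and other pupils"))
```

In practice an established stemmer and a curated stop-word list would replace the hand-written rules, since simple suffix stripping mishandles many English words.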

19 Text Mining
How to mine text
4. Calculate the weights of the remaining terms, looking at the frequency with which the words appear. Two common measures are used for this: the term frequency (tf) factor, the actual number of times the word appears in a document, and the inverse document frequency (idf) factor, which is based on the number of documents in the set that contain the word and shrinks as that number grows.
– If the tf factor is large, the weight increases; if the word appears in many documents of the set (a low idf), the weight decreases
– Reason: a term that appears in most documents is likely to be a common word in the industry, so it does little to distinguish one document from another
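The weighting step can be sketched as follows. The slide gives no formula, so the weight used here, tf multiplied by log(N / df), is an assumption: it is one common variant among several. The function name `tf_idf` and the already-tokenized incident reports are also made up for illustration.

```python
from collections import Counter
from math import log


def tf_idf(documents):
    """Weight each term by tf * log(N / df), where tf is its count in the
    document, N the number of documents and df the number of documents
    that contain the term."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append(
            {term: count * log(n_docs / doc_freq[term])
             for term, count in tf.items()}
        )
    return weights


# Hypothetical incident reports, already reduced to terms
reports = [
    ["coffee", "spillage", "seat", "coffee"],
    ["tea", "spillage", "seat"],
    ["seat", "delay"],
]

weights = tf_idf(reports)
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

With this variant, "seat" gets a weight of zero because it appears in every report, while "coffee" scores highest in the first report because it is frequent there and absent elsewhere.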

20 Web Mining
The discovery and analysis of interesting and useful information from the web and about the web, usually using web-based tools.

21 Types of Web Mining
1. Web content mining – extraction of useful information from web pages. May be used to enhance search results produced by search engines
2. Web structure mining – generating information from the links included in web pages. Can be used to structure the display of a page. Can also identify the members of specific communities and their roles
3. Web usage mining – information generated through web page visits, transactions and web server logs; useful for CRM and understanding user behaviour
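To illustrate the difference between structure mining and usage mining, the sketch below counts inbound links from a small link graph and summarises visits from a simplified log. The page names, the link graph, the (visitor, page) log format and the function names are all hypothetical; real web server logs carry far more detail and need parsing first.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to
links = {
    "home": ["products", "about"],
    "products": ["home", "checkout"],
    "blog": ["products", "home"],
}

# Hypothetical simplified log: (visitor id, page requested), in time order
visits = [
    ("u1", "home"), ("u1", "products"),
    ("u2", "home"), ("u2", "products"), ("u2", "checkout"),
]


def inbound_links(link_graph):
    """Structure mining: how many pages link to each page."""
    return Counter(target for targets in link_graph.values()
                   for target in targets)


def page_visits(log_records):
    """Usage mining: how often each page was requested."""
    return Counter(page for _, page in log_records)


def paths_by_visitor(log_records):
    """Usage mining: the sequence of pages each visitor followed."""
    paths = {}
    for visitor, page in log_records:
        paths.setdefault(visitor, []).append(page)
    return paths


print(inbound_links(links))
print(page_visits(visits))
print(paths_by_visitor(visits))
```

The first function looks only at how pages are connected, while the other two look only at what visitors actually did, which is the kind of behavioural data that feeds CRM.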

22 Web Mining

23 Now ask if …
You are now able to:
– Describe data mining, its characteristics and objectives in business
– Identify and explain the common algorithms used in data mining
– Discuss the use of data mining in different types of business
– Discuss the importance of data mining in understanding customers’ behaviours
– Discuss text and web mining

