Data Mining NATE BUTLER, BRENT DAVIS, BROCK NOLAN, AND NICK THORNHILL.


Outline
● Data Mining Concept: brief history, basic understanding, relationships, capabilities
● Data Mining and OLAP: OLAP cubes, visualization, simple processes
● Data Mining Process: the cans and cannots, problem definition, preparation, and deployment
● Data Mining with SAP: overview of SAP, models, ABC classification, and decision trees
● Open Source Data Mining Tools: various tool examples, models, data mining misconceptions, and "Your Life, Their Data"
● Data Mining with SQL: features involved with SQL, querying, model testing, etc.

What is Data Mining?
● Exploration and analysis of massive amounts of data
● Summarizes large volumes of data into useful information
● Motivated by the search for patterns a company can use
● Establishes relationships and locates trends
● Also known as Knowledge Discovery in Data (KDD)

Brief History
● Began when business data started to be stored on computers.
● Developed rapidly alongside advances in computer technology.
● 1960s: Collection and storage of data on computers, tapes, and disks.
● 1980s: Introduction of relational databases using SQL.
● 1990s: Data warehousing is introduced.
● 1990s: The term "data mining" is introduced.
● Present day: Continues to be driven by businesses wanting useful data.

Basic Understanding

Basic Data Mining Process

Data Mining Relationships
● Classes: stored data is used to locate data in predetermined groups.
  ● Example: a restaurant could track customer data to find when customers visit and what they usually order.
● Clusters: data items are grouped according to logical relationships or consumer preferences.
  ● Example: data can be mined to identify market segments or consumer affinities.
● Associations: mined data can link certain processes or habits together.
  ● Example: a grocery chain found that men who bought diapers on Thursdays and Saturdays also tended to buy beer for the upcoming weekend.
● Sequential patterns: data is mined to anticipate behavior patterns and trends.
  ● Example: an outdoor supplier could predict that if sleeping bags and hiking shoes are purchased, a backpack is likely to be purchased along with them.
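The association idea above can be made concrete with the three measures usually reported for a rule such as "diapers => beer": support, confidence, and lift. The following is a minimal sketch, not the grocery chain's method; the baskets are invented for illustration and the function names are hypothetical.

```python
# Toy transactions, invented for illustration (not real retailer data).
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) from the transactions."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's baseline support."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))

rule_support = support({"diapers", "beer"}, baskets)
rule_confidence = confidence({"diapers"}, {"beer"}, baskets)
rule_lift = lift({"diapers"}, {"beer"}, baskets)
print(rule_support, rule_confidence, rule_lift)
```

A lift above 1 suggests the two items appear together more often than their individual frequencies would predict; with only five toy baskets, the numbers here say nothing about real shoppers.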

Business Perspective of Data Mining
● Strong consumer focus
  ● Retail, financial, communication, and marketing organizations
● Companies look for relationships between internal and external factors
  ● Internal: price, product positioning, staff skills
  ● External: economic indicators, competition, customer demographics
● Outcomes of interest: sales, customer satisfaction, corporate profits

Data Mining and OLAP
● On-Line Analytical Processing (OLAP)
  ● Fast analysis of shared multidimensional data
  ● Supports data summarization, cost allocation, time series analysis, and what-if analysis
● Complementary activities
  ● OLAP provides a multidimensional view of data, which data mining usually cannot.
  ● The two work in tandem.
  ● Data mining can select dimensions for a cube, create new values for a dimension, or create new values for a cube.
  ● OLAP can analyze data mining results at various levels of granularity.
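To illustrate what "multidimensional view" and "data summarization" mean in practice, here is a small sketch of an OLAP-style roll-up: a sales measure is summed over whichever dimensions are dropped. The fact rows, dimension names, and the `roll_up` helper are assumptions made up for this example; a real OLAP server precomputes and indexes such aggregates rather than scanning rows.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales). Values are invented.
facts = [
    ("East", "Tents", "Q1", 100),
    ("East", "Tents", "Q2", 150),
    ("East", "Boots", "Q1", 80),
    ("West", "Tents", "Q1", 120),
    ("West", "Boots", "Q2", 60),
]
DIMENSIONS = ("region", "product", "quarter")

def roll_up(rows, keep):
    """Sum the sales measure over every dimension not listed in keep."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]  # the last column is the sales measure
    return dict(totals)

by_region = roll_up(facts, ["region"])
by_region_quarter = roll_up(facts, ["region", "quarter"])
grand_total = roll_up(facts, [])
print(by_region)
print(by_region_quarter)
print(grand_total)
```

Keeping fewer dimensions gives a coarser summary (rolling up); keeping more gives a finer one (drilling down). That is the sense in which OLAP can look at results "at various levels".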

Data Mining and OLAP Cubes

Data Mining Visualization

What data mining can and can't do
Can:
● Find patterns and relationships in your data
● Discover hidden information in your data
Can't:
● Eliminate the need to know your business or your data
● Tell you the value of the information to your organization

Data Mining Process

Problem Definition
1. Focuses on understanding the project objectives and requirements.
2. Converts this knowledge into a data mining problem definition.
3. Develops a preliminary implementation plan.

Data Gathering and Preparation
1. Collect and explore the data.
2. Determine how well the data addresses the problem.
3. Identify data quality problems.
4. Scan for patterns in the data.
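As one possible way to "identify data quality problems", the sketch below counts missing values per field and exact duplicate rows. The records and the `profile` helper are hypothetical; real preparation work usually checks much more (ranges, formats, inconsistent codes).

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "city": "Tulsa"},
    {"id": 2, "age": None, "city": "Austin"},
    {"id": 3, "age": 29, "city": None},
    {"id": 2, "age": None, "city": "Austin"},  # exact duplicate of the second row
]

def profile(rows):
    """Count missing values per field and exact duplicate rows."""
    missing = {}
    for row in rows:
        for field, value in row.items():
            missing.setdefault(field, 0)
            if value is None:
                missing[field] += 1
    seen = set()
    duplicates = 0
    for row in rows:
        # Sort by field name only, so values of mixed types are never compared.
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}

report = profile(records)
print(report)
```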

Model Building and Evaluation
1. Select and apply various modeling techniques.
2. Calibrate parameters to optimal values.
3. Use algorithms that might require the data to be transformed first.

Knowledge Deployment
1. Use data mining within a target environment.
2. Insight and actionable information can be derived from the data.
3. Integrate data mining models within applications.

Data Mining Tools: SAP
1. SAP stands for Systems, Applications, and Products in Data Processing.
2. Fourth largest software company in the world
3. Business software package designed to integrate all areas of a business
4. Provides end-to-end solutions for financials, manufacturing, logistics, and distribution
5. Shares common business information with every employee

Models of Data Mining in SAP: Clustering
1. Identifies clusters of data objects found in transactions.
2. A cluster is a collection of data objects that are similar to one another.
3. A good clustering method produces high-quality clusters, in which inter-cluster similarity is low and intra-cluster similarity is high.
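The quality criterion above can be checked numerically. The sketch below runs a plain k-means on invented 2-D points and then compares the average distance of points to their own centroid (intra-cluster, should be small) with the distance between centroids (inter-cluster, should be large). K-means is used here only as a familiar stand-in; the slides do not say which algorithm SAP applies, and the starting centroids are chosen by hand.

```python
import math

# Invented 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def mean_point(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return (sum(p[0] for p in group) / len(group),
            sum(p[1] for p in group) / len(group))

def k_means(data, centroids, iterations=10):
    """Assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in data:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters, centroids

clusters, centroids = k_means(points, centroids=[points[0], points[3]])

# Intra-cluster: average distance from each point to its own centroid.
intra = sum(dist(p, centroids[i])
            for i, c in enumerate(clusters) for p in c) / len(points)
# Inter-cluster: distance between the two centroids.
inter = dist(centroids[0], centroids[1])
print(clusters, intra, inter)
```

High similarity corresponds to small distance, so "high intra-cluster similarity, low inter-cluster similarity" shows up here as `intra` being much smaller than `inter`.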

ABC Classification
This method involves classifying your products into three categories to decide which ones to focus on.
A = Outstanding performance.
B = Average importance.
C = Relatively unimportant.
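A common way to assign the three classes is to rank items by value and label them by cumulative share of the total. The sketch below does that; the 80% and 95% cutoffs are a widely used convention assumed here, not values taken from the slides or from SAP, and the revenue figures are invented.

```python
# Invented annual revenue per product.
revenue = {"tent": 5000, "boots": 2500, "stove": 1500, "rope": 600, "mug": 400}

def abc_classify(values, a_cutoff=0.80, b_cutoff=0.95):
    """Rank items by value and label them by cumulative share of the total.

    The default cutoffs are an assumed convention, not an SAP setting.
    """
    total = sum(values.values())
    labels = {}
    running = 0.0
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    for name, value in ranked:
        running += value
        share = running / total
        if share <= a_cutoff:
            labels[name] = "A"
        elif share <= b_cutoff:
            labels[name] = "B"
        else:
            labels[name] = "C"
    return labels

classes = abc_classify(revenue)
print(classes)
```

With these numbers the two top sellers carry 75% of revenue and land in class A, which is the group the method says to focus on.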

Decision Trees
1. Among the most popular predictive modeling techniques, since it provides rules and logic that enable intelligent decision making.
2. Following the rules of a decision tree gives you a clear picture of how the data flows.
3. A typical use of a decision tree is classifying existing customer records into customer segments that behave in a particular manner.
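To show how "following the rules" of a tree classifies a customer record, here is a minimal sketch with a hand-written tree. The fields, thresholds, and segment names are invented for illustration; in practice the tree would be learned from data by a tool rather than typed in.

```python
# A hand-written tree for illustration; fields, thresholds, and segments are invented.
tree = {
    "field": "orders_per_year", "threshold": 12,
    "below": {
        "field": "avg_order_value", "threshold": 100,
        "below": "occasional",
        "at_or_above": "big-ticket",
    },
    "at_or_above": "loyal",
}

def classify(record, node, path=None):
    """Walk the tree from the root; return (segment, list of rules followed)."""
    path = [] if path is None else path
    if isinstance(node, str):  # a leaf holds the segment label
        return node, path
    value = record[node["field"]]
    if value < node["threshold"]:
        path.append(f'{node["field"]} < {node["threshold"]}')
        return classify(record, node["below"], path)
    path.append(f'{node["field"]} >= {node["threshold"]}')
    return classify(record, node["at_or_above"], path)

customer = {"orders_per_year": 4, "avg_order_value": 250}
segment, rules = classify(customer, tree)
print(segment, rules)
```

The returned list of rules is what makes a decision tree easy to explain: each prediction comes with the exact conditions that led to it.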

Example of a Decision Tree

Open Source Data Mining Tools
1. Orange - data mining software built on the Python language, aimed at both novices and experts
2. Weka - Java-based data mining software; Weka can use SQL databases through Java Database Connectivity (JDBC)
3. Rattle GUI - a data mining GUI that uses the R statistical programming language to manipulate and display data trends
4. Apache Mahout - a collection of machine learning algorithms that run on the Apache Hadoop platform
5. RapidMiner - an integrated environment for machine learning, data mining, text mining, predictive analytics, and application development

Data Mining Tools (Cont.) Orange Weka

Data Mining Misconceptions
● Data mining has recently become a buzzword, and as a result people have developed misconceptions about what it really is.
● The term is often used for the entire range of big data analytics, including collection, extraction, analysis, and statistics.
● That definition is too broad. Essentially, data mining finds unknown patterns, unusual records, and dependencies without a prior hypothesis about the analytical outcome.
● The most important objective of any data miner should be to find useful, easily understood information in large data sets.

Your Life, Their Data
As we become a more connected society, companies are using data more and more, and some of them use your data daily.
● Fitbit has started using its activity trackers as a measure of public health and selling its findings to local governments.
● Facebook has been mining user data since it launched its advertising strategy, selling advertisement space with slogans like "Long term relationships with faceless customers".
● Almost every disclaimer or user agreement you accept online has a data mining clause that companies rely on.

Features of Data Mining with SQL
● Multiple data sources: You can use any tabular data source, including spreadsheets and text files.
● Integrated data cleansing: Makes it easier to prepare data for modeling and to retrain and update models.
● Multiple customizable algorithms: Includes clustering, neural networks, decision trees, and even your own custom plug-in algorithms.

Features of Data Mining with SQL (Cont.)
● Model testing infrastructure: Test your data models using cross-validation, classification matrices, lift charts, and scatter plots.
● Querying and drillthrough: SQL Server Data Mining provides the DMX language for integrating prediction queries into applications. You can also retrieve detailed statistics and patterns from the models, and then drill through to the case data.
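Two of the testing ideas named above, cross-validation and the classification matrix, can be sketched in a few lines. This is a generic illustration, not SQL Server's testing infrastructure; the labels are invented and the helper names are hypothetical.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def confusion_matrix(actual, predicted):
    """Count outcome types for a binary classifier (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["TP"] += 1
        elif not a and p:
            counts["FP"] += 1
        elif a and not p:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts

# Invented labels: what actually happened vs. what a model predicted.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(actual, predicted)
accuracy = (matrix["TP"] + matrix["TN"]) / len(actual)
folds = k_fold_indices(len(actual), 3)
print(matrix, accuracy, folds)
```

In cross-validation each fold takes a turn as the held-out test set while the model is trained on the remaining folds, so every case is used for testing exactly once.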

Features of Data Mining with SQL (Cont.)
● Client tools: In addition to the development and design studios provided by SQL Server, you can use add-ins for Excel to create, query, and browse models, or create custom clients, including Web services.
● Security and deployment: Provides role-based security through Analysis Services, including separate permissions for drillthrough to model and structure data. Models can be deployed easily to other servers so that users can access the patterns or perform predictions.

Video Explaining Data Mining