Basic Concepts in Big Data

Slides:



Advertisements
Similar presentations
An Introduction to Data Mining
Advertisements

1.Accuracy of Agree/Disagree relation classification. 2.Accuracy of user opinion prediction. 1.Task extraction performance on Bing web search log with.
Prof. Carolina Ruiz Computer Science Department Bioinformatics and Computational Biology Program WPI WELCOME TO BCB4003/CS4803 BCB503/CS583 BIOLOGICAL.
MS DB Proposal Scott Canaan B. Thomas Golisano College of Computing & Information Sciences.
Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence
Data Mining By Archana Ketkar.
Data Mining – Intro.
Introduction to Data Science Kamal Al Nasr, Matthew Hayes and Jean-Claude Pedjeu Computer Science and Mathematical Sciences College of Engineering Tennessee.
Lesson Outline Introduction: Data Flood
Business Intelligence
Big Data A big step towards innovation, competition and productivity.
Copyright © 2014 Pearson Education, Inc. 1 It's what you learn after you know it all that counts. John Wooden Key Terms and Review (Chapter 6) Enhancing.
Data Mining Knowledge Discovery: An Introduction
Data Mining Techniques
Data Mining. 2 Models Created by Data Mining Linear Equations Rules Clusters Graphs Tree Structures Recurrent Patterns.
Intelligent Systems Lecture 23 Introduction to Intelligent Data Analysis (IDA). Example of system for Data Analyzing based on neural networks.
Kansas State University Department of Computing and Information Sciences CIS 830: Advanced Topics in Artificial Intelligence From Data Mining To Knowledge.
CS598CXZ Course Summary ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign.
Data Mining: Introduction. Why Data Mining? l The Explosive Growth of Data: from terabytes to petabytes –Data collection and data availability  Automated.
Business Intelligence. business intelligence is a broad category of applications and technologies for gathering, providing access to, and analyzing data.
Big Data. What is Big Data? Big Data Analytics: 11 Case Histories and Success Stories
Chapter 1 Introduction to Data Mining
Knowledge Discovery and Data Mining Evgueni Smirnov.
Knowledge Discovery and Data Mining Evgueni Smirnov.
Data Warehousing Data Mining Privacy. Reading Bhavani Thuraisingham, Murat Kantarcioglu, and Srinivasan Iyer Extended RBAC-design and implementation.
1 Categories of data Operational and very short-term decision making data Current, short-term decision making, related to financial transactions, detailed.
Integrating GVis, GIS and KDD for Exploring Spatio-Temporal Data Integrating GVis, GIS and KDD for Exploring Spatio-Temporal Data Monica Wachowicz Wageningen.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
ASUG Illinois-Chicago Chapter Meeting | Nov 6, 2014 ‘Big Data’: Peeling the onion of assumptions, confusion, and potential issues. Thomas McGinnis, Ph.D.
Kislaya Prasad, PhD. Artificial Intelligence Ubiquitous Sensors 2.
Advanced Database Course (ESED5204) Eng. Hanan Alyazji University of Palestine Software Engineering Department.
1 Categories of data Operational and very short-term decision making data Current, short-term decision making, related to financial transactions, detailed.
2008 © ChengXiang Zhai Dragon Star Lecture at Beijing University, June 21-30, 龙星计划课程 : 信息检索 Course Summary ChengXiang Zhai ( 翟成祥 ) Department of.
Advanced Database Concepts
1 Categories of data Operational and very short-term decision making data Current, short-term decision making, related to financial transactions, detailed.
OMIS 694, Big Data Analytics
Data Mining Concepts and Techniques Course Presentation by Ali A. Ali Department of Information Technology Institute of Graduate Studies and Research Alexandria.
Business intelligence systems. Data warehousing. An orderly and accessible repositery of known facts and related data used as a basis for making better.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 28 Data Mining Concepts.
 What is a CRM  Uses of a CRM  What is Data Mining  Data Mining Tasks  How a CRM Utilizes Data Mining  Companies who Use CRM Data Mining.
Big data What they are and why we should care. Previous hot topics 60’s Catastrophe theory 70’s Fractals 80’s Chaos theory 90’s Data mining 00’s Machine.
Chapter 3 Building Business Intelligence Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence 6/22/2016 1Management Information Systems.
CS570: Data Mining Spring 2010, TT 1 – 2:15pm Li Xiong.
Introducing Precictive Analytics
Data Mining.
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 1 —
Introduction Data Mining for Business Analytics (3rd ed.)
Data Mining – Intro.
Data Administration SIG
MIS2502: Data Analytics Advanced Analytics - Introduction
Techniques for Finding Patterns in Large Amounts of Data: Applications in Biology Vipin Kumar William Norris Professor and Head, Department of Computer.
Introduction C.Eng 714 Spring 2010.
Course Summary (Lecture for CS410 Intro Text Info Systems)
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Data Science using R, Minitab & XLMiner R, MinitabXLMiner for Forecasting © ExcelR Solutions. All Rights Reserved.
Data Visualisation with Tableau ExcelR Solutions.
Databases and Database Users
Introduction to TIMAN: Text Information Managemetn & Analysis
Big Data.
כריית מידע -- מבוא ד"ר אבי רוזנפלד.
Big Data.
Data Warehousing and Data Mining
OMIS 665, Big Data Analytics
Data Science introduction.
Big Data.
Data Science using R, Minitab & XLMiner R, MinitabXLMiner for Forecasting © ExcelR Solutions. All Rights Reserved.
Course Summary ChengXiang “Cheng” Zhai Department of Computer Science
Data Mining.
Big DATA.
Welcome! Knowledge Discovery and Data Mining
Presentation transcript:

Basic Concepts in Big Data ChengXiang (“Cheng”) Zhai Department of Computer Science University of Illinois at Urbana-Champaign http://www.cs.uiuc.edu/homes/czhai czhai@illinois.edu

What is “big data”? "Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization” (Gartner 2012) Complicated (intelligent) analysis of data may make a small data “appear” to be “big” Bottom line: Any data that exceeds our current capability of processing can be regarded as “big”

Why is “big data” a “big deal”? Government Obama administration announced “big data” initiative Many different big data programs launched Private Sector Walmart handles more than 1 million customer transactions every hour, which is imported into databases estimated to contain more than 2.5 petabytes of data Facebook handles 40 billion photos from its user base. Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts world-wide Science Large Synoptic Survey Telescope will generate 140 Terabyte of data every 5 days. Biomedical computation like decoding human Genome & personalized medicine Social science revolution -…

Lifecycle of Data: 4 “A”s Aggregation Scattered Data Integrated Data Acquisition Analysis Knowledge Log data Application

Computational View of Big Data Data Visualization Data Analysis Data Access Data Understanding Data Integration Formatting, Cleaning Storage Data

Big Data & Related Topics/Courses Human-Computer Interaction Data Visualization Machine Learning Databases Information Retrieval Data Analysis Data Access Data Mining Computer Vision Speech Recognition Data Understanding Data Integration Natural Language Processing Data Warehousing Formatting, Cleaning Signal Processing Many Applications! Storage Data Information Theory

Some Data Analysis Techniques Visualization Classification Predictive Modeling Time Series Clustering

Example of Analysis: Clustering & Latent Factor Analysis Group M1 Group M2 Group U1 Group U2 Movie 1 Movie 2 … Movie m User1 3.5 4 5 User2 1 User n 2

Example of Analysis: Predictive Modeling Group M1 Group M2 Group U1 Group U2 Movie 1 Movie 2 … Movie m User1 3.5 4 5 User2 1 User n 2 =? Does user2 like movie m? What rating is user2 likely going to give movie m? (Binary) Classification Regression

Some topics we’ll cover