Data Science and Big Data Analytics Chap1: Intro to Big Data Analytics

Slides:



Advertisements
Similar presentations
Nokia Technology Institute Natural Partner for Innovation.
Advertisements

Setting Big Data Capabilities Free How to Make Business on Big Data? Stig Torngaard, Partner Platon.
Big Data Management and Analytics Introduction Spring 2015 Dr. Latifur Khan 1.
Big Data and Predictive Analytics in Health Care Presented by: Mehadi Sayed President and CEO, Clinisys EMR Inc.
Hadoop in the Wild CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook.
CloudSocial Mobility Big data Social connections, mobility, cloud delivery and pervasive information are converging in a powerful way. This convergence.
Observation Pattern Theory Hypothesis What will happen? How can we make it happen? Predictive Analytics Prescriptive Analytics What happened? Why.
MS DB Proposal Scott Canaan B. Thomas Golisano College of Computing & Information Sciences.
Architecting for the Internet of Things
Business Intelligence System September 2013 BI.
Evolution in Coming 10 Years: What's the Future of Network? - Evolution in Coming 10 Years: What's the Future of Network? - Big Data- Big Changes in the.
Presented To: Madam Nadia Gul Presented By: Bi Bi Mariam.
Introduction to Data Science Kamal Al Nasr, Matthew Hayes and Jean-Claude Pedjeu Computer Science and Mathematical Sciences College of Engineering Tennessee.
Chapter 3 Computer Science and the Foundation of Knowledge Model
Copyright © 2014 Pearson Education, Inc. 1 It's what you learn after you know it all that counts. John Wooden Key Terms and Review (Chapter 6) Enhancing.
Basic Marketing Research Customer Insights and Managerial Action
Rapid Mobile Development Enterprises are having a tough time keeping up with the demand for mobile apps. With these growing demands, businesses are expecting.
This presentation was scheduled to be delivered by Brian Mitchell, Lead Architect, Microsoft Big Data COE Follow him Contact him.
What is Business Intelligence? Business intelligence (BI) –Range of applications, practices, and technologies for the extraction, translation, integration,
© 2013 IBM Corporation Version 1.0 The New Eye Insight through Big Data and Analytics: A Case Study on Citizen Sentiment Analysis Sandipan Sarkar, Executive.
Big Data Adoption Drivers Sources: US Data: IDC 2012 Vertical IT & Communications Survey. N = 4177 LatAm Data: PRELIMINARY RESULTS from IDC Latin America.
BIGDATA AND DATASCIENCE By Sigma Analytics and Computing.
Chapter 12: Enhancing Decision Making Dr. Andrew P. Ciganek, Ph.D.
Charles Tappert Seidenberg School of CSIS, Pace University
Reference: An Overview of Business Intelligence Technology, Communications of The ACM, August VOL 54 NO.8
TOP – 3090 Emerging Technologies Social, Mobile, Cloud, Big Data Retail.
© 2012 IBM Corporation IBM Security Systems 1 © 2013 IBM Corporation 1 Ecommerce Antoine Harfouche.
IoT, Big Data and Emerging Technologies
Click to add text TWA New Job Types with Tivoli Workload Scheduler for Applications 8.6 TWS Education.
Datawarehouse A sneak preview. 2 Data Warehouse Approach An old idea with a new interest: Cheap Computing Power Special Purpose Hardware New Data Structures.
Big Data Analytics Large-Scale Data Management Big Data Analytics Data Science and Analytics How to manage very large amounts of data and extract value.
1 Categories of data Operational and very short-term decision making data Current, short-term decision making, related to financial transactions, detailed.
Chapter 5: Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization DECISION SUPPORT SYSTEMS AND BUSINESS.
+ Big Data IST210 Class Lecture. + Big Data Summary by EMC Corporation ( More videos that.
CSE 102 Introduction to Computer Engineering What is Computer Engineering?
Big Data – Big Opportunity Mohammad Khansari ITRC President Jan 2015 ITRC, Tehran, Iran.
© 2012 IBM Corporation Converting Big Data into Big Knowledge.
CISC 849 : Applications in Fintech Namami Shukla Dept of Computer & Information Sciences University of Delaware iCARE : A Framework for Big Data Based.
What we know or see What’s actually there Wikipedia : In information technology, big data is a collection of data sets so large and complex that it.
OMIS 694, Big Data Analytics
MAR Capability Overview Deck Protean Analytics.
BUSINESS INTELLIGENCE & ADVANCED ANALYTICS DISCOVER | PLAN | EXECUTE JANUARY 14, 2016.
D E P A R T M E N T O F COMPUTER SCIENCE AND SYSTEMS ANALYSIS SCHOOL OF ENGINEERING & APPLIED SCIENCE O X F O R D O H I O MIAMI UNIVERSITY Business Intelligence.
Copyright © 2016 Pearson Education, Inc. Modern Database Management 12 th Edition Jeff Hoffer, Ramesh Venkataraman, Heikki Topi CHAPTER 11: BIG DATA AND.
Chapter 8: Web Analytics, Web Mining, and Social Analytics
BIG DATA. The information and the ability to store, analyze, and predict based on that information that is delivering a competitive advantage.
© 2007 IBM Corporation IBM Software Strategy Group IBM Google Announcement on Internet-Scale Computing (“Cloud Computing Model”) Oct 8, 2007 IBM Confidential.
Chapter 3 Building Business Intelligence Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence 6/22/2016 1Management Information Systems.
FACULTY EXTERNSHIP OPPORTUNITIES IN DATA SCIENCE AND DATA ANALYTICS Facilitated by: FilAm Software Technology, Clark Freeport Zone Ecuiti, San Francisco,
Data Resource Management – MGMT An overview of where we are right now SQL Developer OLAP CUBE 1 Sales Cube Data Warehouse Denormalized Historical.
CS570: Data Mining Spring 2010, TT 1 – 2:15pm Li Xiong.
Hadoop in the Wild CMSC 491 Hadoop-Based Distributed Computing Spring 2016 Adam Shook.
Leverage Big Data With Hadoop Analytics Presentation by Ravi Namboori Visit
Wake Technical Community College “Wake Tech” Largest community college in NC 70,000+ students a year attending.
Data Analytics (CS40003) Introduction to Data Lecture #1
Telenor contribution to Digital Transformation of Norway
What Business Analytics Can Do For You!
CNIT131 Internet Basics & Beginning HTML
Data Analytics 1 - THE HISTORY AND CONCEPTS OF DATA ANALYTICS
Business Intelligence Minor
Data Mining Generally, (Sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it.
April 25, 2012 The Three R’s Are Old School – Now It Is All About Volume, Velocity & Variety Peter Guest Alberta Public Sector Client Technical Advisor.
© 2016 Global Market Insights, Inc. USA. All Rights Reserved Fuel Cell Market size worth $25.5bn by 2024 Infrared LED Market shipments.
April 25, 2012 The Three R’s Are Old School – Now It Is All About Volume, Velocity & Variety Peter Guest Alberta Public Sector Client Technical Advisor.
OMIS 665, Big Data Analytics
Big Data.
 Deep Analytical Talent  Data Savvy Professionals  Technology and Data Enablers.
INNOvation in TRAINING BUSINESS ANALYSTS HAO HElEN Zhang UniVERSITY of ARIZONA
Charles Tappert Seidenberg School of CSIS, Pace University
Big DATA.
Presentation transcript:

Data Science and Big Data Analytics Chap1: Intro to Big Data Analytics Charles Tappert Seidenberg School of CSIS, Pace University

1.1 Big Data Overview Industries that gather and exploit data Credit card companies monitor purchase Good at identifying fraudulent purchases Mobile phone companies analyze calling patterns – e.g., even on rival networks Look for customers might switch providers For social networks data is primary product Intrinsic value increases as data grows

Attributes Defining Big Data Characteristics Huge volume of data Not just thousands/millions, but billions of items Complexity of data types and structures Varity of sources, formats, structures Speed of new data creation and grow High velocity, rapid ingestion, fast analysis

Sources of Big Data Deluge Mobile sensors – GPS, accelerometer, etc. Social media – 700 Facebook updates/sec in2012 Video surveillance – street cameras, stores, etc. Video rendering – processing video for display Smart grids – gather and act on information Geophysical exploration – oil, gas, etc. Medical imaging – reveals internal body structures Gene sequencing – more prevalent, less expensive, healthcare would like to predict personal illnesses

Sources of Big Data Deluge

Example: Genotyping from 23andme.com

1.1.1 Data Structures: Characteristics of Big Data

Data Structures: Characteristics of Big Data Structured – defined data type, format, structure Transactional data, OLAP cubes, RDBMS, CVS files, spreadsheets Semi-structured Text data with discernable patterns – e.g., XML data Quasi-structured Text data with erratic data formats – e.g., clickstream data Unstructured Data with no inherent structure – text docs, PDF’s, images, video

Example of Structured Data

Example of Semi-Structured Data

Example of Quasi-Structured Data visiting 3 websites adds 3 URLs to user’s log files

Example of Unstructured Data Video about Antarctica Expedition

1.1.2 Types of Data Repositories from an Analyst Perspective

1.2 State of the Practice in Analytics Business Intelligence (BI) versus Data Science Current Analytical Architecture Drivers of Big Data Emerging Big Data Ecosystem and a New Approach to Analytics

Business Drivers for Advanced Analytics

1.2.1 Business Intelligence (BI) versus Data Science

1.2.2 Current Analytical Architecture Typical Analytic Architecture

Current Analytical Architecture Data sources must be well understood EDW – Enterprise Data Warehouse From the EDW data is read by applications Data scientists get data for downstream analytics processing

1.2.3 Drivers of Big Data Data Evolution & Rise of Big Data Sources

1.2.4 Emerging Big Data Ecosystem and a New Approach to Analytics Four main groups of players Data devices Games, smartphones, computers, etc. Data collectors Phone and TV companies, Internet, Gov’t, etc. Data aggregators – make sense of data Websites, credit bureaus, media archives, etc. Data users and buyers Banks, law enforcement, marketers, employers, etc.

Emerging Big Data Ecosystem and a New Approach to Analytics

1.3 Key Roles for the New Big Data Ecosystem Deep analytical talent Advanced training in quantitative disciplines – e.g., math, statistics, machine learning Data savvy professionals Savvy but less technical than group 1 Technology and data enablers Support people – e.g., DB admins, programmers, etc.

Three Key Roles of the New Big Data Ecosystem

Three Recurring Data Scientist Activities Reframe business challenges as analytics challenges Design, implement, and deploy statistical models and data mining techniques on Big Data Develop insights that lead to actionable recommendations

Profile of Data Scientist Five Main Sets of Skills

Profile of Data Scientist Five Main Sets of Skills Quantitative skill – e.g., math, statistics Technical aptitude – e.g., software engineering, programming Skeptical mindset and critical thinking – ability to examine work critically Curious and creative – passionate about data and finding creative solutions Communicative and collaborative – can articulate ideas, can work with others

1.4 Examples of Big Data Analytics Retailer Target Uses life events: marriage, divorce, pregnancy Apache Hadoop Open source Big Data infrastructure innovation MapReduce paradigm, ideal for many projects Social Media Company LinkedIn Social network for working professionals Can graph a user’s professional network 250 million users in 2014

Data Visualization of User’s Social Network Using InMaps

Summary Big Data comes from myriad sources Social media, sensors, IoT, video surveillance, and sources only recently considered Companies are finding creative and novel ways to use Big Data Exploiting Big Data opportunities requires New data architectures New machine learning algorithms, ways of working People with new skill sets Always Review Chapter Exercises

Focus of Course Focus on quantitative disciplines – e.g., math, statistics, machine learning Provide overview of Big Data analytics In-depth study of a several key algorithms