BIG DATA ANALYTICS: A Presentation by Meg Monsen, Michael Leonard, and Eric Zeng


AGENDA
• Big Data Analytics and its Objectives
• Financial Impact
• Structured vs. Unstructured Data
• Users of Big Data
• Relevant Technologies (Hadoop, MongoDB)
• Coding Examples
• Future of Analytics

WHAT IS BIG DATA AND WHY DOES IT MATTER?
• Defining Big Data Analytics
  • Examining large sets of data
  • Discovering patterns and trends
  • Data warehouses are insufficient
• Purposes
  • Uncovering hidden needs of customers
  • Improving operational efficiency

BIG DATA & OPERATIONAL EFFICIENCY
• “By using big data for operations analysis, organizations can gain real-time visibility into operations, customer experience, transactions and behavior.” – IBM
• Core Objectives
  • Gain
  • Analyze
  • Apply
  • Optimize

FINANCIAL IMPACT OF BIG DATA
• High cost of poor data quality
  • $3.1 trillion to US government annually
  • 10-25% of US business revenues
• Opportunities for qualified analysts
  • Business Analyst: $66,000
  • Data Analyst: $60,000
  • Data Scientist: $113,000

DIMENSIONS OF BIG DATA
• Essential Characteristics:
  • Volume - Data quantity
  • Velocity - Data speed
  • Variety - Data types

STRUCTURED VS. UNSTRUCTURED DATA
Structured Data
• Represented as text
• Transactional data, formal reports, accounting records of sales and costs
• Relational databases / data warehouse
• SQL
Unstructured Data
• May be textual or non-textual
• Mobile usage, click stream activity, social media responses, genomic data
• No structured database / data lake
• NoSQL (Not only SQL), SQL Batch Queries
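The contrast above can be sketched in a few lines of Python. This is an illustration only and is not taken from the slides: the `sales` table, the sample records, and every field name are hypothetical. The structured half uses the standard-library `sqlite3` module with a fixed schema and an SQL query; the unstructured half keeps free-form JSON records whose fields vary from one record to the next and filters them in application code.

```python
import json
import sqlite3

# Structured data: every row follows the same schema and is queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, units INTEGER, unit_price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("widget", 10, 2.50), ("gadget", 3, 10.00), ("widget", 5, 2.50)],
)
revenue_by_item = dict(
    conn.execute(
        "SELECT item, SUM(units * unit_price) FROM sales GROUP BY item"
    ).fetchall()
)
conn.close()

# Unstructured (schema-less) data: records do not share a fixed set of fields,
# so each one is parsed and inspected individually.
raw_events = [
    '{"type": "click", "page": "/home", "device": "mobile"}',
    '{"type": "post", "text": "Love the new widget!", "likes": 12}',
    '{"type": "click", "page": "/cart"}',
]
events = [json.loads(line) for line in raw_events]
clicked_pages = [e["page"] for e in events if e.get("type") == "click"]

print(revenue_by_item)
print(clicked_pages)
```

The SQL query can rely on every row having `units` and `unit_price`; the event filter has to use `e.get(...)` because a given record may or may not carry a particular field.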

ILLUSTRATIVE EXAMPLE
• Inventory Analyst
• Insurance Actuary

INTERPRETATIONS
• Big Data Analytics
• Structured Data

USERS OF BIG DATA
• Device manufacturers, ERP providers, and consulting firms comprise 7 of the top 10 users of Big Data
• Based on a survey conducted by Dell of large corporations in 2014…
  • 55% now follow a Big Data strategy
  • 60% of Big Data projects involve a cloud
  • 32% involve real-time or near real-time processing
  • 22% use a data lake
  • 20% of projects are handled by outside consultants

HADOOP
• Free, Java-based programming framework
• Distributes the storage and processing of large data sets
• Started from a Google File System paper published in October 2003
• Development was furthered by Apache
• Named after Doug Cutting’s son’s toy elephant (logo!)

WHEN TO USE (AND NOT USE) HADOOP
YES!
• Analytics
• Search
• Data Retention
• Log File processing
• Analysis of Text, Image, Audio, and Video Content
• Recommendation systems like in E-Commerce Websites
NO!
• Low-latency or near real-time data access
• Large number of small files to process
• Multiple write scenarios requiring arbitrary writes between files

WHO USES HADOOP?

HADOOP FRAMEWORK
• Hadoop Common: Contains all the libraries and utilities
• Hadoop Distributed File System (HDFS): Storage with high bandwidth
• Hadoop YARN: Resource-management platform
• Hadoop MapReduce: Programming model for data processing

HDFS

MAPREDUCE

MAPREDUCE EXAMPLE
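The content of this slide did not survive as text. As a stand-in, here is a minimal single-process sketch of the MapReduce programming model, using the classic word-count task. It is not the presenters' example and it does not use Hadoop's Java API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical. On a real cluster the map and reduce steps would run in parallel on many machines, with the framework doing the grouping in between.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key into one result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of text splits."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Each string stands in for one input split stored on a different node.
splits = ["big data big ideas", "data lake data warehouse"]
counts = word_count(splits)
print(counts)
```

Because each call to `map_phase` looks at one split only and each call to `reduce_phase` looks at one key only, both steps can be distributed without the workers needing to share state.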

MONGODB

MONGODB = “THE DATABASE FOR GIANT IDEAS”
• Cross-platform document-oriented database
• Open-source
• “The database for giant ideas”
• Founded in 2007; written to handle specific problems encountered at DoubleClick
• Classified as NoSQL database

MONGODB EXAMPLE
Also, we can practice! exercises/#PracticeOnline
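The example shown on this slide is not in the transcript. To give a feel for the document-oriented model, the sketch below imitates it in plain Python, with no MongoDB server or driver involved. The `customers` data and the `find` helper are hypothetical; the helper mimics only the simplest kind of query, an equality match on top-level fields. In MongoDB itself, a query of this shape is written as a filter document passed to a collection's `find` method.

```python
import json

# A "collection" is modeled here as a plain list of documents (dicts).
# Documents in the same collection need not share the same fields.
customers = [
    {"name": "Ada", "city": "Boston", "orders": [{"item": "widget", "qty": 2}]},
    {"name": "Lin", "city": "Austin", "tags": ["vip"]},
    {"name": "Sam", "city": "Boston"},
]


def find(collection, query):
    """Return documents whose top-level fields equal every field in `query`."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    ]


boston = find(customers, {"city": "Boston"})
names = [doc["name"] for doc in boston]
print(json.dumps(boston, indent=2))
```

Note that the first document embeds its orders directly, where a relational design would use a separate table and a join; that nesting is the main practical difference from the structured model discussed earlier.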

THE FUTURE OF BIG DATA ANALYTICS

ANY QUESTIONS?