Published by Clyde Hill. Modified over 9 years ago.
1
Big Data
2
What is Big Data?
https://www.youtube.com/watch?v=c4BwefH5Ve8
Big Data Analytics: 11 Case Histories and Success Stories
https://www.youtube.com/watch?annotation_id=annotation_3535169775&feature=iv&src_vid=c4BwefH5Ve8&v=t4wtzIuoY0w
3
Big Data
Data Size:
– Gigabyte
– Terabyte: a terabyte USB drive
– Petabyte: Wal-Mart handles more than 1m customer transactions every hour, feeding databases estimated at more than 2.5 petabytes
– Exabyte: the traffic flowing over the internet amounts to about 700 exabytes annually
– Zettabyte
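To make the scale of these units concrete, here is a minimal Python sketch that converts between them. It assumes decimal (SI) units, where each step up is a factor of 1,000; the `UNITS` list and the `convert` function are illustrative names, not part of the original slides.

```python
# Decimal (SI) byte units: each step up the list is a factor of 1,000.
# Assumption: decimal units rather than binary (1,024-based) units.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def convert(value, from_unit, to_unit):
    """Convert a data size from one unit in UNITS to another."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    factor = 1000 ** abs(steps)
    # Going to a smaller unit multiplies; going to a larger unit divides.
    return value * factor if steps >= 0 else value / factor

# The figures from the slide, expressed in neighbouring units.
print(convert(2.5, "PB", "TB"))   # Wal-Mart's databases, in terabytes
print(convert(700, "EB", "ZB"))   # annual internet traffic, in zettabytes
```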
4
Big Data: Some Facts
– The world's information is doubling every two years
– The world generated 1.8 ZB of information in 2011
– Cisco predicts that by 2016 annual global IP traffic will reach 1.3 zettabytes
– There will be 19 billion networked devices by 2016
– 70% of this data is generated by individuals rather than by enterprises and organizations
5
Big Data Sources
– Web sites
– Social media
– Machine generated
– RFID
– Image, video, and audio
– Etc.
6
Big Data Challenges
Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.
"3Vs":
– Volume: size >= 30-50 TB
– Velocity: processing speed
– Variety: structured data (able to fit in a database table) and unstructured data
7
Do Companies care about Data?
Not really. What they care about are Key Performance Indicators (KPIs).
Some examples of KPIs are:
– Revenue
– Profit
– Revenue per customer/employee
– Customer attrition: the loss of clients or customers
Big Data is only useful if it helps drive KPIs.
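As a rough illustration of how the KPIs listed above could be derived from raw figures, the sketch below computes profit, revenue per customer, and an attrition rate. The `PeriodStats` fields, the `kpis` function, and all numbers are hypothetical; real KPI definitions vary by company.

```python
from dataclasses import dataclass

@dataclass
class PeriodStats:
    """Hypothetical raw figures for one reporting period."""
    revenue: float
    costs: float
    customers_at_start: int
    customers_lost: int

def kpis(stats):
    """Derive a few simple KPIs from the raw figures."""
    return {
        "profit": stats.revenue - stats.costs,
        "revenue_per_customer": stats.revenue / stats.customers_at_start,
        # Attrition: share of the starting customers lost during the period.
        "attrition_rate": stats.customers_lost / stats.customers_at_start,
    }

# Made-up example numbers, for illustration only.
example = PeriodStats(revenue=1_000_000.0, costs=750_000.0,
                      customers_at_start=2000, customers_lost=100)
print(kpis(example))
```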
8
Big Data to KPIs
9
Applications
Text mining: deriving high-quality information from text.
– Text categorization, text clustering, concept/entity extraction, sentiment analysis, etc.
Web mining:
– Web usage mining
– Web content mining
Social media mining:
– Salesforce Radian6 Social Marketing Cloud
http://www.youtube.com/watch?v=EH1dcFh_-I4
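Sentiment analysis, one of the text mining tasks named above, can be sketched in a very simplified form as counting positive and negative words. The word lists and the `sentiment` function below are hypothetical stand-ins; production systems typically use trained models and far larger lexicons.

```python
import re

# Tiny hypothetical lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "fast"}
NEGATIVE = {"bad", "slow", "hate", "broken"}

def sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this phone, it is fast"))
print(sentiment("The app is slow and broken"))
```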
10
Hadoop
HDFS: Hadoop Distributed File System
“Imagine you had a file that was larger than your PC's capacity. You could not store that file, right? Hadoop lets you store files bigger than what can be stored on one particular node or server. So you can store very, very large files. It also lets you store many, many files.”
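The idea of storing a file larger than any single node can be sketched by splitting it into fixed-size blocks and spreading those blocks across nodes. This is a toy single-process simulation, not the HDFS API: the function names, node names, and tiny block size are assumptions, and real HDFS uses much larger blocks and also replicates each block on several nodes.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes):
    """Assign blocks to nodes round-robin, remembering each block's index."""
    placement = {node: [] for node in nodes}
    for index, block in enumerate(blocks):
        placement[nodes[index % len(nodes)]].append((index, block))
    return placement

def read_file(placement):
    """Reassemble the file by collecting blocks from all nodes in index order."""
    indexed = [pair for held in placement.values() for pair in held]
    return b"".join(block for _, block in sorted(indexed))

# A 25-byte "file" stored on two hypothetical nodes that each hold part of it.
data = bytes(range(25))
placement = place_blocks(split_into_blocks(data, block_size=10), ["node1", "node2"])
print({node: [index for index, _ in held] for node, held in placement.items()})
```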
11
Hadoop: MapReduce
“rather than take the conventional step of moving data over a network to be processed by software, MapReduce uses a smarter approach tailor made for big data sets.”
“…rather than move the data to the software, MapReduce moves the processing software to the data.” (InfoWeek)
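The MapReduce pattern itself can be illustrated with the classic word count. The sketch below runs in a single Python process and does not use the Hadoop API; the function names are illustrative. In a real cluster, the map step would run on the nodes that already hold each piece of data, which is the "move the processing to the data" idea in the quote.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run map, shuffle, and reduce over a collection of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

print(word_count(["big data", "big files"]))
```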
12
NoSQL Database
NoSQL ("Not Only SQL") is a broad class of database management systems identified by non-adherence to the widely used relational database management system model. These systems are useful when working with a huge quantity of data whose nature does not require a relational model.
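One common NoSQL style is the document store, where each record is a free-form document rather than a row in a fixed table. The toy `DocumentStore` class below is a hypothetical in-process sketch built on a Python dict, not the API of any real NoSQL product; it only shows that records with different fields can live side by side.

```python
class DocumentStore:
    """Toy key-to-document store with no fixed schema."""

    def __init__(self):
        self._docs = {}

    def put(self, key, document):
        """Store a copy of the document under the given key."""
        self._docs[key] = dict(document)

    def get(self, key):
        """Return the document for key, or None if it does not exist."""
        return self._docs.get(key)

    def find(self, **criteria):
        """Return documents whose fields match all the given values."""
        return [doc for doc in self._docs.values()
                if all(doc.get(field) == value for field, value in criteria.items())]

store = DocumentStore()
# The two documents do not share the same set of fields.
store.put("user:1", {"name": "Ana", "city": "Lima"})
store.put("user:2", {"name": "Bo", "tags": ["vip"]})
print(store.find(city="Lima"))
```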
13
In-Memory Database
An in-memory database is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism.
– Main memory databases are faster than disk-optimized databases.
– Good for Big Data analytics.
– Can use non-volatile memory modules that retain data even when electrical power is removed.
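A small way to see the idea in practice is SQLite's in-memory mode, available through Python's standard `sqlite3` module. SQLite is used here only as a convenient stand-in, not as an example of a Big Data in-memory system; the table name and sample rows are made up.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM; nothing is
# written to disk, and the data disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 30.0)],  # hypothetical rows
)

# A simple analytic query: total sales per region.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
conn.close()
print(totals)
```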
14
SAP HANA
HANA (High-Performance Analytic Appliance) uses sophisticated data compression to store data in random access memory. HANA's performance is reported to be 10,000 times faster than standard disks, which allows companies to analyze data in a matter of seconds instead of long hours. (Techopedia)